G-Means Improved for Cell BE Environment

Foina, Aislan G.; Badia, Rosa M.; Ramirez-Fernandez, Javier

doi:10.1007/978-3-642-16233-6_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6310)

Abstract

This paper describes the performance gain obtained by adapting the G-means algorithm to a Cell BE environment using the CellSs framework. G-means is a clustering algorithm based on k-means that finds the number of Gaussian distributions, and their centers, in a multi-dimensional dataset. It is typically used in data mining applications, and its execution can be divided into six steps. The paper analyzes each step to determine which can be improved. In the implementation, the algorithm was modified to use the SIMD instructions specific to the Cell processor and to introduce parallel computing, with the CellSs framework handling the SPU tasks. The hardware used was an IBM BladeCenter QS22 containing two PowerXCell processors. The results show that the improved code executes 60% faster than the non-improved code.
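To make the idea behind G-means concrete, the following is a minimal, sequential sketch of its core decision: whether a single cluster looks Gaussian or should be split in two. It follows the general scheme published by Hamerly and Elkan (split the cluster with 2-means, project the points onto the line joining the two child centers, and apply an Anderson-Darling normality test). It is not the chapter's six-step implementation and uses neither SIMD nor CellSs. All function names, the deterministic 2-means initialization, and the fixed iteration count are assumptions made for illustration.

```python
from statistics import NormalDist

# Critical value of the corrected statistic at significance 0.0001,
# as reported by Hamerly and Elkan for G-means.
CRITICAL_VALUE = 1.8692
_STD_NORMAL = NormalDist()


def anderson_darling(values):
    """Corrected Anderson-Darling statistic A*^2 for a 1-D sample."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    z = sorted((v - mean) / std for v in values)
    eps = 1e-12  # keeps the logarithms finite in the extreme tails
    total = 0.0
    for i in range(n):
        lo = min(max(_STD_NORMAL.cdf(z[i]), eps), 1.0 - eps)
        hi = min(max(_STD_NORMAL.cdf(z[n - 1 - i]), eps), 1.0 - eps)
        total += (2 * i + 1) * (_log(lo) + _log(1.0 - hi))
    a2 = -n - total / n
    # Correction for mean and variance estimated from the data.
    return a2 * (1.0 + 4.0 / n - 25.0 / (n * n))


def _log(x):
    from math import log
    return log(x)


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def two_means(points, iterations=20):
    """Hypothetical helper: plain 2-means with a deterministic start."""
    first = points[0]
    far = max(points, key=lambda p: sq_dist(p, first))
    centers = [list(first), list(far)]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[nearest].append(p)
        for k in (0, 1):
            if groups[k]:  # an empty group keeps its previous center
                dim = len(groups[k][0])
                centers[k] = [sum(p[d] for p in groups[k]) / len(groups[k])
                              for d in range(dim)]
    return centers


def should_split(points):
    """Return True if the cluster fails the Gaussianity test."""
    c1, c2 = two_means(points)
    v = [a - b for a, b in zip(c1, c2)]
    norm2 = sum(x * x for x in v)
    if norm2 == 0.0:  # child centers coincide: nothing to split
        return False
    projected = [sum(x * w for x, w in zip(p, v)) / norm2 for p in points]
    return anderson_darling(projected) > CRITICAL_VALUE
```

In a full G-means loop, `should_split` would be called on every current cluster; clusters that fail the test are replaced by their two children, k-means is rerun on the whole dataset, and the process repeats until no cluster is split. The distance computations inside `two_means` are the kind of data-parallel work that SIMD instructions and task-based parallelism target, although the sketch above makes no attempt to reproduce the paper's optimizations.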

Author information

Authors and Affiliations

Universidade de São Paulo, Av. Prof. Luciano Gualberto, Travessa 3, 158, 05508-970, São Paulo, Brazil
Aislan G. Foina & Javier Ramirez-Fernandez
Barcelona Supercomputing Center and Artificial Intelligence Research Institute (IIIA), Spanish National Research Council (CSIC), Jordi Girona, 31, 08034, Barcelona, Spain
Aislan G. Foina & Rosa M. Badia

Editor information

Editors and Affiliations

High Performance Computing Center Stuttgart (HLRS), Universität Stuttgart, Nobelstr. 19, 70569, Stuttgart, Germany
Rainer Keller
Institute of Computer Science and Engineering, Karlsruhe Institute of Technology, Haid-und-Neu-Str. 7, 76131, Karlsruhe, Germany
David Kramer
Engineering Mathematics and Computing Lab (EMCL) & Institute for Applied and Numerical Mathematics 4, Karlsruhe Institute of Technology, Fritz-Erler-Str. 23, 76133, Karlsruhe, Germany
Jan-Philipp Weiss

About this chapter

Cite this chapter

Foina, A.G., Badia, R.M., Ramirez-Fernandez, J. (2010). G-Means Improved for Cell BE Environment. In: Keller, R., Kramer, D., Weiss, JP. (eds) Facing the Multicore-Challenge. Lecture Notes in Computer Science, vol 6310. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16233-6_8

DOI: https://doi.org/10.1007/978-3-642-16233-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16232-9
Online ISBN: 978-3-642-16233-6
eBook Packages: Computer Science, Computer Science (R0)
