Abstract
The performance gain obtained by the adaptation of the G-means algorithm for a Cell BE environment using the CellSs framework is described. G-means is a clustering algorithm based on k-means, used to find the number of Gaussian distributions and their centers inside a multi-dimensional dataset. It is normally used for data mining applications, and its execution can be divided into 6 execution steps. This paper analyzes each step to select which of them could be improved. In the implementation, the algorithm was modified to use the specific SIMD instructions of the Cell processor and to introduce parallel computing using the CellSs framework to handle the SPU tasks. The hardware used was an IBM BladeCenter QS22 containing two PowerXCell processors. The results show the execution of the algorithm 60% faster as compared with the non-improved code.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Lyman, P., Varian, H.R.: How Much Information (2003), http://www.sims.berkeley.edu/how-much-info-2003 (retrieved from December 2009)
Macqueen, J.: Some Methods of Classification and Analysis of Multivariate Observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297 (1967)
Hamerly, G., Elkan, C.: Learning the K in K-Means. Neural Information Processing Systems 16, 281–288 (2003)
Simek, F.: Implementation of K-means Algorithm on the Cell Processor. BSc. Thesis. Czech Technical University in Prague (2007)
Buehrer, G., Parthasarathy, S., Goyder, M.: Data mining on the cell broadband engine. In: Proceedings of the 22nd Annual International Conference on Supercomputing, Island of Kos, Greece, pp. 26–35. ACM, New York (2008)
Hong-tao, B., Li-li, H., Dan-tong, O., Zhan-shan, L., He, L.: K-Means on Commodity GPUs with CUDA. In: Computer Science and Information Engineering, WRI World Congress, pp. 651–655 (2009)
Tian, J., Zhu, L., Zhang, S., Liu, L.: Improvement and Parallelism of k-Means Clustering Algorithm. Tsinghua Science &Technology 10, 277–281 (2005)
Pelleg, D., Moore, A.: X-means: Extending K-means with Efficient Estimation of the Number of Clusters. In: Proceedings of the 17th International Conf. on Machine Learning, pp. 727–734 (2000)
Perez, J.M., Bellens, P., Badia, R.M., Labarta, J.: CellSs: Programming the Cell/B.E. made easier. IBM Journal of R&D 51(5), 593–604 (2007)
Buehrer, G., Parthasarathy, S.: The Potential of the Cell Broadband Engine for Data Mining. Ohio State University Technical Report OSU-CISRC-3/07–TR22 (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Foina, A.G., Badia, R.M., Ramirez-Fernandez, J. (2010). G-Means Improved for Cell BE Environment. In: Keller, R., Kramer, D., Weiss, JP. (eds) Facing the Multicore-Challenge. Lecture Notes in Computer Science, vol 6310. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16233-6_8
Download citation
DOI: https://doi.org/10.1007/978-3-642-16233-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16232-9
Online ISBN: 978-3-642-16233-6
eBook Packages: Computer ScienceComputer Science (R0)