Multi-grain Parallel Processing of Data-Clustering on Programmable Graphics Hardware

Takizawa, Hiroyuki; Kobayashi, Hiroaki

doi:10.1007/978-3-540-30566-8_5

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3358)

Included in the following conference series:

International Symposium on Parallel and Distributed Processing and Applications

Abstract

This paper presents an effective scheme for clustering a huge data set using a commodity programmable graphics processing unit (GPU). Because the GPU has an application-specific architecture, one current research issue is how to map the data-clustering process onto the rendering pipeline. Taking advantage of the GPU's parallel processing capability, our implementation scheme is devised to exploit the multi-grain single-instruction multiple-data (SIMD) parallelism of the nearest neighbor search, which is the most computationally intensive part of the data-clustering process. The performance of our scheme is compared with that of an implementation running entirely on the CPU. Experimental results clearly show that the parallelism of the nearest neighbor search allows our scheme to execute the data-clustering process efficiently. Although data transfer from the GPU to the CPU is generally costly, the acceleration provided by the GPU significantly reduces the total execution time of data clustering.
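As a rough illustration of why the nearest neighbor search lends itself to data parallelism, the following is a minimal CPU-side sketch, assuming a k-means-style formulation in which each data vector is assigned to its closest cluster centre. The abstract does not spell out the clustering algorithm or how it is mapped to the rendering pipeline, so the function names (`squared_distance`, `nearest_centroid`, `assign_clusters`) and the use of squared Euclidean distance are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    # Innermost loop: one independent term per vector component.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point: Vector, centroids: Sequence[Vector]) -> int:
    # Middle loop: one independent distance per candidate centroid,
    # followed by a minimum reduction. Ties go to the lowest index.
    best_index = 0
    best_dist = squared_distance(point, centroids[0])
    for i in range(1, len(centroids)):
        d = squared_distance(point, centroids[i])
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index


def assign_clusters(points: Sequence[Vector],
                    centroids: Sequence[Vector]) -> List[int]:
    # Outermost loop: every data point is searched independently of the others.
    return [nearest_centroid(p, centroids) for p in points]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0), (10.0, 10.0)]
    cts = [(0.0, 0.0), (10.0, 10.0)]
    print(assign_clusters(pts, cts))
```

Each of the three loops performs the same operation on independent data (per point, per centroid, per component), which is the kind of work a SIMD device can execute in parallel at more than one granularity. The sequential version above only makes that loop structure explicit.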

Author information

Authors and Affiliations

Graduate School of Information Sciences, Tohoku University, Aoba, Aramaki-aza, Aoba-ku, Sendai, 980-8578, Japan
Hiroyuki Takizawa
Information Synergy Center, Tohoku University, Aoba, Aramaki-aza, Aoba-ku, Sendai, 980-8578, Japan
Hiroaki Kobayashi

Editor information

Editors and Affiliations

Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong, China
Jiannong Cao
Department of Computer Science, St. Francis Xavier University, Antigonish, Canada
Laurence T. Yang
Department of Computer Science and Engineering, Shanghai Jiao Tong University, 200030, Shanghai, China
Minyi Guo
Department of Computer Science, The University of Hong Kong, Pokfulam
Francis Lau

Cite this paper

Takizawa, H., Kobayashi, H. (2004). Multi-grain Parallel Processing of Data-Clustering on Programmable Graphics Hardware. In: Cao, J., Yang, L.T., Guo, M., Lau, F. (eds) Parallel and Distributed Processing and Applications. ISPA 2004. Lecture Notes in Computer Science, vol 3358. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30566-8_5

DOI: https://doi.org/10.1007/978-3-540-30566-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24128-7
Online ISBN: 978-3-540-30566-8
eBook Packages: Computer Science (R0)
