Abstract
Canopy clustering is a preprocessing method for standard clustering algorithms such as k-means and hierarchical agglomerative clustering, and it can greatly reduce their computational cost. However, a naïve implementation of canopy clustering can itself take a vast amount of time on massive data. To address this problem, we present efficient algorithms and implementations of canopy clustering on GPUs, which have recently evolved into general-purpose many-core processors. We not only accelerate the original canopy clustering computation but also propose an algorithm that uses a grid index. This algorithm partitions the data into cells to reduce redundant computations and, at the same time, to exploit the parallelism of GPUs. Experiments show that the proposed GPU implementations are on average 2 times faster than multi-threaded, SIMD implementations on two octa-core CPUs.
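For readers unfamiliar with the method, the following is a minimal sequential sketch of canopy clustering as it is commonly described, not the GPU or grid-index algorithm of this paper. It assumes Euclidean distance, a loose threshold `t1` and a tight threshold `t2` with `t1 >= t2`, and the variant in which each new canopy is built only from points still in the candidate list. The function name `canopy_clustering` and its parameters are illustrative choices.

```python
import math


def canopy_clustering(points, t1, t2):
    """Sequential canopy clustering sketch.

    points: list of equal-length coordinate tuples
    t1: loose threshold (a point within t1 of a center joins its canopy)
    t2: tight threshold (a point within t2 of a center is removed from
        the candidate list and can no longer start or join a new canopy)
    Returns a list of (center_index, member_indices) pairs.
    """
    if t2 < 0 or t1 < t2:
        raise ValueError("thresholds must satisfy t1 >= t2 >= 0")

    remaining = list(range(len(points)))  # candidate list of point indices
    canopies = []
    while remaining:
        center = remaining[0]  # assumption: pick the first remaining point
        members = []
        still_remaining = []
        for i in remaining:
            d = math.dist(points[center], points[i])
            if d <= t1:
                members.append(i)  # loosely close: belongs to this canopy
            if d > t2:
                still_remaining.append(i)  # not tightly close: stays a candidate
        canopies.append((center, members))
        remaining = still_remaining  # the center itself (d == 0) is always removed
    return canopies


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0), (2.5, 0.0)]
    for center, members in canopy_clustering(pts, t1=3.0, t2=1.5):
        print(center, members)
```

Canopies may overlap: a point that lies within `t1` but outside `t2` of a center stays in the candidate list and can join later canopies. Each iteration compares one center against every remaining point, and that distance loop is the kind of computation a parallel implementation would distribute across threads.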
F. Hayashi—Currently working at International Laboratory Corporation.
Acknowledgments
This research was partly supported by a Grant-in-Aid for Scientific Research (B) (#26280037) from the Japan Society for the Promotion of Science.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kozawa, Y., Hayashi, F., Amagasa, T., Kitagawa, H. (2015). Parallel Canopy Clustering on GPUs. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe DEXA 2015. Lecture Notes in Computer Science, vol 9261. Springer, Cham. https://doi.org/10.1007/978-3-319-22849-5_23
DOI: https://doi.org/10.1007/978-3-319-22849-5_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22848-8
Online ISBN: 978-3-319-22849-5
eBook Packages: Computer Science, Computer Science (R0)