DOI: 10.1145/3211922.3211925
short-paper

Efficient k-means on GPUs

Published: 11 June 2018

ABSTRACT

k-Means is a versatile clustering algorithm that is widely used in practice. To cluster large data sets, state-of-the-art implementations use GPUs to shorten the data-to-knowledge time. These implementations commonly assign points to clusters on a GPU and update the centroids on a CPU.

We show that this approach has two main drawbacks. First, it separates the two algorithm phases over different processors, which requires an expensive data exchange between devices. Second, even when both phases are computed on the GPU, the same data are read twice per iteration, leading to inefficient use of memory bandwidth.
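The two phases and the repeated data read can be made concrete with a minimal CPU-side sketch in plain Python. It is an illustration only, not the implementation evaluated in the paper: the function names `assign_points` and `update_centroids`, the sample data, and the handling of empty clusters are all assumptions made here.

```python
def assign_points(points, centroids):
    """Pass 1: read every point and label it with its nearest centroid."""
    labels = []
    for p in points:
        # Squared Euclidean distance to each centroid.
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def update_centroids(points, labels, centroids):
    """Pass 2: read every point again to average the members of each cluster."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, label in zip(points, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += p[d]
    # Assumption of this sketch: an empty cluster keeps its previous centroid.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(k)
    ]


# Hypothetical toy data: two well-separated groups in 2D.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(1.0, 1.0), (9.0, 1.0)]

labels = assign_points(points, centroids)                    # first read of the data
new_centroids = update_centroids(points, labels, centroids)  # second read of the data
```

When the two functions run on different processors, the `labels` array (or the points themselves) has to cross the device boundary in every iteration. When both run on the same processor, `points` is still traversed twice.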

In this paper, we describe a new approach that executes k-means in a single data pass per iteration. We propose a new algorithm to update centroids that allows us to perform both phases efficiently on GPUs. Thereby, we remove data transfers within each iteration. We fuse both phases to eliminate artificial synchronization barriers, and thus compute k-means in a single data pass. Overall, we achieve up to 20x higher throughput compared to the state-of-the-art approach.
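The idea of fusing both phases into one data pass can be sketched sequentially in plain Python. The abstract does not spell out the centroid-update algorithm, so this is not the authors' GPU method: it only shows that the per-cluster sums and counts can be accumulated at the moment each point is assigned. The name `fused_kmeans_iteration`, the sample data, and the empty-cluster rule are assumptions of this sketch. A GPU version would also need some way to combine partial sums across threads, which is not shown here.

```python
def fused_kmeans_iteration(points, centroids):
    """One k-means iteration that reads each point exactly once."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k

    for p in points:  # the only traversal of the data in this iteration
        # Assignment step: find the nearest centroid for this point.
        best, best_dist = 0, float("inf")
        for j, c in enumerate(centroids):
            dist = sum((a - b) ** 2 for a, b in zip(p, c))
            if dist < best_dist:
                best, best_dist = j, dist

        # Update step, fused in: accumulate while the point is still at hand.
        counts[best] += 1
        for d in range(dim):
            sums[best][d] += p[d]

    # Assumption of this sketch: an empty cluster keeps its previous centroid.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(k)
    ]


# Hypothetical toy data: two well-separated groups in 2D.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(1.0, 1.0), (9.0, 1.0)]
result = fused_kmeans_iteration(points, centroids)
```

No label array is materialized between the two steps, so nothing has to be written out and read back within an iteration.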


  • Published in

    DAMON '18: Proceedings of the 14th International Workshop on Data Management on New Hardware
    June 2018
    75 pages
    ISBN: 9781450358538
    DOI: 10.1145/3211922

    Copyright © 2018 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    Overall Acceptance Rate: 80 of 102 submissions, 78%
