
DVT-PKM: An Improved GPU Based Parallel K-Means Algorithm

  • Conference paper
Intelligent Computing Methodologies (ICIC 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8589)

Abstract

The K-Means clustering algorithm is a typical partition-based clustering algorithm. It has two major disadvantages: it is sensitive to the choice of initial cluster centers, and outliers exert a significant influence on the clustering results. In addition, K-Means traverses and processes the entire data set multiple times, so it is inefficient on large data sets. To overcome these limitations, this paper proposes to exclude outliers using a minimum number of points within a d-dimensional hypersphere. The k cluster centers are then obtained by adjusting a threshold based on the idea of density. Finally, K-Means is integrated with the Compute Unified Device Architecture (CUDA), and time efficiency is improved considerably by exploiting the computing power of the Graphics Processing Unit (GPU). The evaluation criteria are speedup and the ratio of between-class distance to within-class distance. Experiments indicate that the proposed algorithm significantly improves the stability and running efficiency of K-Means.
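The abstract does not give the algorithm's details, so the following is only a minimal CPU-side sketch in Python of two ideas it mentions: excluding points that have too few neighbours inside a hypersphere, and scoring a clustering by the ratio of between-class to within-class distance. The function names (`filter_outliers`, `between_within_ratio`), the parameters `radius` and `min_pts`, the choice not to count a point as its own neighbour, and the exact form of the ratio are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def filter_outliers(points, radius, min_pts):
    """Keep points with at least `min_pts` other points within `radius`.

    Assumption: the hypersphere is centred on each point and the point
    itself is not counted. The paper may define this differently.
    """
    kept = []
    for i, p in enumerate(points):
        neighbours = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        if neighbours >= min_pts:
            kept.append(p)
    return kept


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    dim = len(cluster[0])
    return tuple(sum(p[d] for p in cluster) / len(cluster) for d in range(dim))


def between_within_ratio(clusters):
    """Mean centroid-to-centroid distance over mean point-to-centroid distance.

    Assumption: this is one plausible reading of "ratio of distance between
    classes to distance within classes"; larger values mean better separation.
    Requires at least two clusters and a non-zero within-class distance.
    """
    centers = [centroid(c) for c in clusters]
    pairs = [(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]]
    between = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    total_points = sum(len(c) for c in clusters)
    within = sum(
        math.dist(p, m) for c, m in zip(clusters, centers) for p in c
    ) / total_points
    return between / within
```

The neighbour count above is quadratic in the number of points, and each point's count is independent of the others, which is the kind of workload that maps naturally onto one GPU thread per point. How the paper actually organises its CUDA kernels is not stated in the abstract.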


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Yan, B., Zhang, Y., Yang, Z., Su, H., Zheng, H. (2014). DVT-PKM: An Improved GPU Based Parallel K-Means Algorithm. In: Huang, DS., Jo, KH., Wang, L. (eds) Intelligent Computing Methodologies. ICIC 2014. Lecture Notes in Computer Science, vol 8589. Springer, Cham. https://doi.org/10.1007/978-3-319-09339-0_60

  • DOI: https://doi.org/10.1007/978-3-319-09339-0_60

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09338-3

  • Online ISBN: 978-3-319-09339-0

  • eBook Packages: Computer Science (R0)
