Mapping Data Mining Algorithms on a GPU Architecture: A Study

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6804)

Abstract

Data mining algorithms are designed to extract information automatically from very large amounts of data. The datasets analysed with these techniques are gathered from a variety of domains, from business-related fields to HPC and supercomputers. Because these datasets continue to grow at an exponential rate, research has focused on parallelizing different data mining techniques. Recently, hybrid GPU architectures have started to be used for this task. However, the data transfer rate between CPU and GPU is a bottleneck for applications dealing with large data entries that exhibit numerous dependencies. In this paper we analyse how efficient data mining algorithms can be mapped on these architectures by extracting the common characteristics of these methods and by looking at the communication patterns between the main memory and the GPU’s shared memory. We propose an experimental study of the performance of memory systems on GPU architectures when dealing with data mining algorithms, and we advance performance model guidelines based on our observations.
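The transfer bottleneck mentioned in the abstract can be illustrated with a back-of-envelope estimate. The sketch below is not the performance model from the paper; it is a minimal illustration that assumes one pass over the data, a fixed link bandwidth, a fixed GPU throughput, and no overlap between copying and computing. All function names and parameter values are hypothetical.

```python
# Back-of-envelope estimate of when a GPU offload is limited by the
# CPU-to-GPU copy rather than by computation.
# Assumptions (not taken from the paper): a single pass over the data,
# constant link bandwidth, constant GPU throughput, no copy/compute overlap.


def offload_time(bytes_moved, flops, link_bytes_per_s, gpu_flops_per_s):
    """Return (transfer_seconds, compute_seconds) for one offloaded pass."""
    transfer_s = bytes_moved / link_bytes_per_s
    compute_s = flops / gpu_flops_per_s
    return transfer_s, compute_s


def is_transfer_bound(bytes_moved, flops, link_bytes_per_s, gpu_flops_per_s):
    """True when copying the data takes longer than processing it."""
    transfer_s, compute_s = offload_time(
        bytes_moved, flops, link_bytes_per_s, gpu_flops_per_s
    )
    return transfer_s > compute_s


def break_even_intensity(link_bytes_per_s, gpu_flops_per_s):
    """Operations per transferred byte at which copy and compute times match."""
    return gpu_flops_per_s / link_bytes_per_s


if __name__ == "__main__":
    # Hypothetical figures chosen only for illustration.
    data_bytes = 10**9   # 1 GB dataset
    link = 8e9           # assumed link bandwidth, bytes per second
    gpu = 1e12           # assumed GPU throughput, operations per second
    for ops_per_byte in (10, 1000):
        t, c = offload_time(data_bytes, ops_per_byte * data_bytes, link, gpu)
        print(ops_per_byte, "ops/byte -> transfer", t, "s, compute", c, "s")
    print("break-even ops/byte:", break_even_intensity(link, gpu))
```

Under these assumptions, an algorithm that performs few operations per transferred byte spends most of its time on the copy, which is one way to read the abstract's concern about large, dependency-heavy data entries.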

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gainaru, A., Slusanschi, E., Trausan-Matu, S. (2011). Mapping Data Mining Algorithms on a GPU Architecture: A Study. In: Kryszkiewicz, M., Rybinski, H., Skowron, A., Raś, Z.W. (eds) Foundations of Intelligent Systems. ISMIS 2011. Lecture Notes in Computer Science, vol 6804. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21916-0_12

  • DOI: https://doi.org/10.1007/978-3-642-21916-0_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21915-3

  • Online ISBN: 978-3-642-21916-0

  • eBook Packages: Computer Science, Computer Science (R0)
