GPU-Accelerated Restricted Boltzmann Machine for Collaborative Filtering

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7439)

Abstract

Collaborative Filtering (CF) is an important technique for recommendation systems: it models and analyzes customer preferences in order to give reasonable recommendations. Recently, many applications based on the Restricted Boltzmann Machine (RBM) have been developed for a large variety of learning problems. The RBM-based model for Collaborative Filtering (RBM-CF) can handle large-scale data sets and obtains good recommendation performance. However, the computation of the RBM becomes problematic when a large number of hidden features is used to improve recommendation accuracy. Although the RBM has great potential for parallelism, developing a parallel implementation of RBM-CF on a GPU remains a challenge, since the data sets for CF are always large and sparse. In this paper, we propose a parallel implementation of RBM-CF on GPU using CUDA. We first present how to transform the computation of RBM-CF into matrix-based operations on the GPU, and then present three CUDA kernels for sparse matrix-matrix multiplication that further improve the computational efficiency of RBM-CF when modeling large-scale, sparse data sets. Experimental results show that our parallel implementation achieves significant speedups on the GPU.
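To make the idea of "matrix-based operations on sparse data" concrete, the following is a minimal sequential sketch in Python, not the paper's CUDA kernels. It assumes the ratings are stored as a binary sparse matrix in CSR form (one row per user, one column per visible unit) and computes the hidden-unit probabilities as a sparse-times-dense product followed by a logistic sigmoid. All names (`csr_matmul_dense`, `hidden_probabilities`, the toy data) are hypothetical and chosen only for illustration.

```python
import math


def csr_matmul_dense(indptr, indices, data, dense, n_cols_out):
    """Multiply a CSR sparse matrix by a dense matrix (list of rows).

    Only the stored non-zeros are visited, which is the property a
    sparse GPU kernel would also exploit. This version is sequential.
    """
    out = []
    for row in range(len(indptr) - 1):
        acc = [0.0] * n_cols_out
        # Non-zeros of this sparse row live in data[indptr[row]:indptr[row + 1]].
        for p in range(indptr[row], indptr[row + 1]):
            k, v = indices[p], data[p]
            dense_row = dense[k]
            for j in range(n_cols_out):
                acc[j] += v * dense_row[j]
        out.append(acc)
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(indptr, indices, data, weights, hidden_bias):
    """Return sigmoid(V @ W + b) for a sparse visible matrix V in CSR form."""
    n_hidden = len(hidden_bias)
    pre = csr_matmul_dense(indptr, indices, data, weights, n_hidden)
    return [
        [sigmoid(pre_row[j] + hidden_bias[j]) for j in range(n_hidden)]
        for pre_row in pre
    ]


# Toy data (assumed, not from the paper): 2 users, 3 visible units, 2 hidden units.
# Dense view of V: [[1, 0, 1],
#                   [0, 1, 0]]
INDPTR = [0, 2, 3]
INDICES = [0, 2, 1]
DATA = [1.0, 1.0, 1.0]
WEIGHTS = [[0.5, -0.5], [1.0, 0.0], [-0.5, 0.5]]  # visible x hidden
HIDDEN_BIAS = [0.0, 0.0]

probs = hidden_probabilities(INDPTR, INDICES, DATA, WEIGHTS, HIDDEN_BIAS)
```

In a GPU setting, the outer loop over rows (or over individual non-zeros) is the natural unit of parallel work; how the paper's three kernels actually partition that work is not stated in the abstract, so this sketch makes no claim about it.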

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cai, X., Xu, Z., Lai, G., Wu, C., Lin, X. (2012). GPU-Accelerated Restricted Boltzmann Machine for Collaborative Filtering. In: Xiang, Y., Stojmenovic, I., Apduhan, B.O., Wang, G., Nakano, K., Zomaya, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2012. Lecture Notes in Computer Science, vol 7439. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33078-0_22

  • DOI: https://doi.org/10.1007/978-3-642-33078-0_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33077-3

  • Online ISBN: 978-3-642-33078-0

  • eBook Packages: Computer Science, Computer Science (R0)
