GPU-Accelerated Restricted Boltzmann Machine for Collaborative Filtering

Cai, Xianggao; Xu, Zhanpeng; Lai, Guoming; Wu, Chengwei; Lin, Xiaola

doi:10.1007/978-3-642-33078-0_22

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7439)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

Collaborative Filtering (CF) is an important technique for recommendation systems, which model and analyze customer preferences in order to give reasonable advice. Recently, many applications based on the Restricted Boltzmann Machine (RBM) have been developed for a wide variety of learning problems. The RBM-based model for Collaborative Filtering (RBM-CF) can handle large-scale data sets and achieves good recommendation performance. However, the computational cost of the RBM becomes problematic when a large number of hidden features is used to improve recommendation accuracy. Although the RBM has great potential for parallelism, developing a parallel implementation of RBM-CF on a GPU remains a challenge, since CF data sets are typically large and sparse. In this paper, we propose a parallel implementation of RBM-CF on GPU using CUDA. We first show how to transform the computation of RBM-CF into matrix-based operations on the GPU, and then present three CUDA kernels for sparse matrix-matrix multiplication that further improve the computational efficiency of RBM-CF when modeling large-scale, sparse data sets. Experimental results show that our parallel implementation on GPU achieves significant speedups.
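To make the idea of a sparse matrix-matrix product concrete, the following is a minimal CPU sketch in plain Python, not the authors' CUDA kernels. It assumes the sparse ratings matrix is stored in compressed sparse row (CSR) form and multiplied by a dense weight matrix. The names `to_csr`, `csr_matmul`, `ratings`, and `weights` are hypothetical and chosen only for illustration. On a GPU, the outer loop over rows (or over output elements) is the natural candidate for mapping onto threads.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def to_csr(dense: Matrix) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense matrix to CSR arrays (values, col_idx, row_ptr)."""
    values: List[float] = []
    col_idx: List[int] = []
    row_ptr: List[int] = [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in values/col_idx
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matmul(values: List[float], col_idx: List[int],
               row_ptr: List[int], weights: Matrix) -> Matrix:
    """Multiply a CSR sparse matrix by a dense matrix, skipping zeros."""
    n_rows = len(row_ptr) - 1
    n_cols_out = len(weights[0]) if weights else 0
    out = [[0.0] * n_cols_out for _ in range(n_rows)]
    for i in range(n_rows):  # each row is independent of the others
        for k in range(row_ptr[i], row_ptr[i + 1]):
            v = values[k]
            w_row = weights[col_idx[k]]
            for h in range(n_cols_out):
                out[i][h] += v * w_row[h]
    return out


# Illustrative data (an assumption, not from the paper):
# 2 users x 4 items, mostly zeros, and 4 items x 2 hidden features.
ratings = [[1.0, 0.0, 0.0, 1.0],
           [0.0, 0.0, 1.0, 0.0]]
weights = [[0.5, -1.0],
           [2.0, 0.25],
           [1.0, 1.0],
           [-0.5, 0.5]]

hidden_input = csr_matmul(*to_csr(ratings), weights)
print(hidden_input)
```

Only the stored nonzero entries are visited, so the work scales with the number of observed ratings rather than with the full users-by-items size, which is the property that makes sparse formats attractive for CF data.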

Author information

Authors and Affiliations

School of Information Science and Technology, Sun Yat-sen University, Guangzhou, 510275, China
Xianggao Cai, Zhanpeng Xu, Chengwei Wu & Xiaola Lin
Department of Computer Application and Technology, Hanshan Normal University, Chaozhou, 521041, China
Guoming Lai

Editor information

Editors and Affiliations

School of Information Technology, Deakin University, Melbourne Burwood Campus, 221 Burwood Highway, 3125, Burwood, VIC, Australia
Yang Xiang
SEECS, University of Ottawa, 8, King Edward Ave, K1N 6N5, Ottawa, ON, Canada
Ivan Stojmenovic
Department of Intelligent Informatics, Kyushu Sangyo University, 2-3-1 Matsukadai, Higashi-ku, 813-8503, Fukuoka, Japan
Bernady O. Apduhan
School of Information Science and Engineering, Central South University, 410083, Changsha, Hunan Province, P.R. China
Guojun Wang
Department of Information Engineering, Hiroshima University, 1-4-1, Kagamiyama, 739-8527, Higashi-Hiroshima, Japan
Koji Nakano
School of Information Technologies, University of Sydney, Building J12, 2006, Sydney, NSW, Australia
Albert Zomaya

About this paper

Cite this paper

Cai, X., Xu, Z., Lai, G., Wu, C., Lin, X. (2012). GPU-Accelerated Restricted Boltzmann Machine for Collaborative Filtering. In: Xiang, Y., Stojmenovic, I., Apduhan, B.O., Wang, G., Nakano, K., Zomaya, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2012. Lecture Notes in Computer Science, vol 7439. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33078-0_22

DOI: https://doi.org/10.1007/978-3-642-33078-0_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33077-3
Online ISBN: 978-3-642-33078-0
eBook Packages: Computer Science, Computer Science (R0)
