GLDA: Parallel Gibbs Sampling for Latent Dirichlet Allocation on GPU

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 626)

Abstract

With the development of general-purpose computing on GPUs, more and more algorithms are being moved to the GPU to gain much higher speed. In this paper, we propose an approach that accelerates Gibbs sampling for the LDA (Latent Dirichlet Allocation) algorithm on the GPU by distributing the data evenly across the GPU cores, which avoids idle waiting and improves GPU utilization. We test the algorithm on three text mining datasets. Experiments show that our parallel method achieves about a 30x speedup over sequential training with similar prediction precision. Furthermore, the idea of uniformly partitioning data across the GPU can also be applied to other machine learning algorithms.
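
As a rough, hypothetical sketch of the partitioning idea described in the abstract (not the authors' actual implementation), the following CUDA kernel assigns tokens to threads in a round-robin fashion, so every thread owns a nearly equal share of the work during one collapsed Gibbs sweep. All names (glda_sample_kernel, n_dk, n_kw, prob_buf, rand_u) are illustrative assumptions, and the topic counts are read without locking, a common approximation in parallel Gibbs samplers.

// Hypothetical sketch, not the paper's implementation.
// One collapsed-Gibbs sweep over LDA token assignments; token i is owned by
// thread (i mod n_threads), so all GPU cores receive nearly equal work.
__global__ void glda_sample_kernel(int n_tokens, int K, int V,
                                   float alpha, float beta,
                                   const int *doc_id,   // document of each token
                                   const int *word_id,  // word of each token
                                   int *topic,          // current topic of each token
                                   int *n_dk,           // document-topic counts (n_docs x K)
                                   int *n_kw,           // topic-word counts (K x V)
                                   int *n_k,            // per-topic totals (K)
                                   const float *rand_u, // one uniform draw per token
                                   float *prob_buf)     // scratch, n_threads x K floats
{
    int tid       = blockIdx.x * blockDim.x + threadIdx.x;
    int n_threads = gridDim.x * blockDim.x;

    for (int i = tid; i < n_tokens; i += n_threads) {
        int d = doc_id[i], w = word_id[i], old_k = topic[i];

        // Remove the current assignment from the counts.
        atomicSub(&n_dk[d * K + old_k], 1);
        atomicSub(&n_kw[old_k * V + w], 1);
        atomicSub(&n_k[old_k], 1);

        // Unnormalized collapsed-Gibbs probabilities p(k | rest), stored as a CDF.
        // Counts are read non-atomically, an approximation when threads race.
        float *p = prob_buf + (size_t)tid * K;
        float sum = 0.0f;
        for (int k = 0; k < K; ++k) {
            sum += (n_dk[d * K + k] + alpha) *
                   (n_kw[k * V + w] + beta) /
                   (n_k[k] + V * beta);
            p[k] = sum;
        }

        // Invert the CDF with the precomputed uniform draw to pick the new topic.
        float u = rand_u[i] * sum;
        int new_k = 0;
        while (new_k < K - 1 && p[new_k] < u) ++new_k;

        // Add the new assignment back into the counts.
        topic[i] = new_k;
        atomicAdd(&n_dk[d * K + new_k], 1);
        atomicAdd(&n_kw[new_k * V + w], 1);
        atomicAdd(&n_k[new_k], 1);
    }
}

Under these assumptions, a host-side loop would launch the kernel once per Gibbs iteration and refresh rand_u (e.g. with cuRAND) between launches; how the actual GLDA kernel balances the load and synchronizes counts is described in the paper itself.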

Acknowledgments

This work is supported by the Natural Science Fund of Tianjin City (No. 16JCYBJC15200), the Open Project Fund of the State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences (No. CARCH201504), the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130031120029), and the Open Fund of provincial- and ministerial-level scientific research institutions, Civil Aviation University of China (No. CAAC-ISECCA-201502).

Author information

Corresponding author

Correspondence to Tao Li.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Xue, P., Li, T., Zhao, K., Dong, Q., Ma, W. (2016). GLDA: Parallel Gibbs Sampling for Latent Dirichlet Allocation on GPU. In: Wu, J., Li, L. (eds) Advanced Computer Architecture. ACA 2016. Communications in Computer and Information Science, vol 626. Springer, Singapore. https://doi.org/10.1007/978-981-10-2209-8_9

  • DOI: https://doi.org/10.1007/978-981-10-2209-8_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-2208-1

  • Online ISBN: 978-981-10-2209-8

  • eBook Packages: Computer Science (R0)
