Research on big data mining based on improved parallel collaborative filtering algorithm

Cluster Computing

Abstract

In the big data era, the traditional parallel collaborative filtering algorithm cannot meet the demands of data analysis in either processing efficiency or accuracy. This paper therefore improves the traditional parallel collaborative filtering algorithm. It analyzes the execution flow of collaborative filtering, discusses the shortcomings of the traditional parallel algorithm, and then describes in detail the three steps of the improved algorithm: generating node scoring vectors, obtaining neighboring nodes, and forming recommendation information. Finally, the improved algorithm is evaluated on three measures: running time, speedup, and recommendation accuracy. Experimental results show that the improved parallel collaborative filtering algorithm runs more efficiently and recommends more accurately than the traditional parallel algorithm based on the co-occurrence matrix.
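The three steps named in the abstract (scoring vectors, neighboring nodes, recommendation information) can be illustrated with a minimal single-process sketch. This is not the paper's implementation, which is not visible in this preview. It assumes that a node is a user, that similarity is cosine similarity over co-rated items, and that predictions are similarity-weighted averages; the function names and the sample ratings are hypothetical.

```python
from math import sqrt

def build_score_vectors(ratings):
    """Step 1: group (user, item, score) triples into one scoring vector per user."""
    vectors = {}
    for user, item, score in ratings:
        vectors.setdefault(user, {})[item] = score
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse scoring vectors (assumed similarity measure)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def nearest_neighbors(vectors, target, k):
    """Step 2: return the k users most similar to target as (similarity, user) pairs."""
    sims = [(cosine(vectors[target], vec), other)
            for other, vec in vectors.items() if other != target]
    sims = [(s, o) for s, o in sims if s > 0]
    sims.sort(key=lambda pair: (-pair[0], pair[1]))
    return sims[:k]

def recommend(vectors, target, k=2, top_n=3):
    """Step 3: rank items the target has not scored by similarity-weighted average."""
    seen = vectors[target]
    weighted, total = {}, {}
    for sim, other in nearest_neighbors(vectors, target, k):
        for item, score in vectors[other].items():
            if item in seen:
                continue
            weighted[item] = weighted.get(item, 0.0) + sim * score
            total[item] = total.get(item, 0.0) + sim
    predictions = {item: weighted[item] / total[item] for item in weighted}
    return sorted(predictions, key=lambda i: (-predictions[i], i))[:top_n]

# Hypothetical sample data: (user, item, score) triples.
ratings = [
    ("u1", "a", 5), ("u1", "b", 3),
    ("u2", "a", 5), ("u2", "b", 3), ("u2", "c", 4),
    ("u3", "a", 1), ("u3", "b", 5), ("u3", "d", 2),
]
vectors = build_score_vectors(ratings)
print(recommend(vectors, "u1"))
```

Because the neighbor search and the recommendation step for one user do not depend on the results for any other user, a parallel version could plausibly partition the users across workers. Whether the paper partitions the work this way is an assumption here.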

Author information

Corresponding author

Correspondence to Heng Li.

About this article

Cite this article

Zhu, L., Li, H. & Feng, Y. Research on big data mining based on improved parallel collaborative filtering algorithm. Cluster Comput 22 (Suppl 2), 3595–3604 (2019). https://doi.org/10.1007/s10586-018-2209-9

  • DOI: https://doi.org/10.1007/s10586-018-2209-9
