Abstract
In the big data era, traditional parallel collaborative filtering algorithms can no longer meet the efficiency and accuracy demands of data analysis. This paper therefore improves the traditional parallel collaborative filtering algorithm. It analyzes the execution flow of collaborative filtering, discusses the shortcomings of the traditional parallel algorithm, and then describes in detail the three steps of the improved algorithm: generating node scoring vectors, obtaining neighboring nodes, and forming recommendation information. Finally, the improved algorithm is evaluated on three measures: running time, speedup, and recommendation accuracy. Experimental results show that the improved algorithm runs more efficiently and recommends more accurately than the traditional parallel algorithm based on the co-occurrence matrix.
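The three steps named in the abstract can be illustrated with a small single-machine sketch. This is not the authors' parallel implementation, which the abstract does not specify. It assumes user nodes, cosine similarity, and a similarity-weighted average for prediction, and every function name here (`build_score_vectors`, `nearest_neighbors`, `recommend`, `speedup`) is hypothetical. The `speedup` helper uses the common definition of serial time divided by parallel time, which may differ from the paper's exact measure.

```python
from math import sqrt


def build_score_vectors(ratings):
    """Step 1 (assumed form): group (user, item, score) triples into
    one sparse scoring vector per user node."""
    vectors = {}
    for user, item, score in ratings:
        vectors.setdefault(user, {})[item] = score
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def nearest_neighbors(vectors, user, k):
    """Step 2 (assumed form): the k most similar other users,
    ignoring users with zero similarity."""
    sims = [
        (cosine(vectors[user], vec), other)
        for other, vec in vectors.items()
        if other != user
    ]
    sims = [(s, o) for s, o in sims if s > 0]
    sims.sort(key=lambda pair: (-pair[0], pair[1]))
    return sims[:k]


def recommend(vectors, user, k=2, top_n=3):
    """Step 3 (assumed form): rank unseen items by the
    similarity-weighted average of the neighbors' scores."""
    seen = vectors[user]
    totals, weights = {}, {}
    for sim, other in nearest_neighbors(vectors, user, k):
        for item, score in vectors[other].items():
            if item in seen:
                continue
            totals[item] = totals.get(item, 0.0) + sim * score
            weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: totals[item] / weights[item] for item in totals}
    return sorted(predicted, key=lambda i: (-predicted[i], i))[:top_n]


def speedup(t_serial, t_parallel):
    """Speedup as serial running time divided by parallel running time."""
    return t_serial / t_parallel
```

In a parallel setting, each of these steps would typically be distributed by partitioning the users across workers; the sketch keeps everything in memory only to make the data flow between the three steps visible.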
Cite this article
Zhu, L., Li, H. & Feng, Y. Research on big data mining based on improved parallel collaborative filtering algorithm. Cluster Comput 22 (Suppl 2), 3595–3604 (2019). https://doi.org/10.1007/s10586-018-2209-9
DOI: https://doi.org/10.1007/s10586-018-2209-9