Abstract
Recommender systems apply information filtering technologies to identify a set of items that could be of interest to a user. Collaborative filtering (CF) is one of the best-known and most successful filtering techniques in recommender systems and has been widely applied. However, the usual CF techniques face issues that limit their application, especially when dealing with highly sparse and large-scale data. For instance, CF algorithms using the k-Nearest Neighbor approach are very effective at filtering interesting items for users, but at the same time they are computationally expensive, and their cost grows non-linearly with the number of users and items in a database. To address these scalability issues, some researchers have proposed using clustering methods. K-means is among the best-known clustering algorithms, but it depends on the number of clusters and on the initial centroids, which can lead to inaccurate recommendations and increased computation time. In this paper, we show, by comparison with K-means-based approaches, how a clustering algorithm called K-means+, which considers the statistical nature of the data, can improve recommendation performance with reasonable computation time. The results show that the proposed K-means+ method yields predictions of substantially better quality. These results also provide significant evidence that the proposed Splitting-Merging clustering-based CF is more scalable than the conventional one.
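To make the clustering idea concrete, the following is a minimal sketch of cluster-based CF, not the paper's method: it uses plain K-means (the baseline the abstract compares against), not K-means+. The toy rating matrix, the function names (`sq_dist`, `kmeans`, `predict`), the choice of the first k users as initial centroids, and the convention that 0 means "not rated" are all assumptions made for illustration. Users are grouped once, and a rating is then predicted from the user's cluster mates only, instead of searching all users for neighbors.

```python
from statistics import mean

def sq_dist(a, b):
    """Squared Euclidean distance between two rating vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iters=20):
    """Plain K-means. Assumption: the first k rows seed the centroids,
    which makes the result deterministic but initialization-dependent."""
    centroids = [list(r) for r in rows[:k]]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [rows[i] for i, lab in enumerate(labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels

def predict(ratings, labels, user, item):
    """Average the item's ratings over cluster mates who rated it;
    fall back to the user's own mean rating if none did."""
    peers = [
        ratings[v][item]
        for v in range(len(ratings))
        if v != user and labels[v] == labels[user] and ratings[v][item] > 0
    ]
    if peers:
        return mean(peers)
    rated = [r for r in ratings[user] if r > 0]
    return mean(rated) if rated else 0.0

# Hypothetical toy data: rows are users, columns are items, 0 = not rated.
# Simplification: the distance treats 0 as an ordinary value.
ratings = [
    [5, 4, 0, 1, 1],
    [1, 1, 5, 4, 0],
    [4, 5, 1, 0, 1],
    [0, 1, 4, 5, 5],
    [5, 5, 0, 1, 2],
    [1, 0, 5, 5, 4],
]
labels = kmeans(ratings, k=2)
print(labels)
print(predict(ratings, labels, user=1, item=4))
```

Because the prediction step only scans one cluster, its cost depends on cluster size and not on the total number of users. The sketch also exposes the two weaknesses named in the abstract: both `k` and the seed centroids are fixed by hand.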
Copyright information
© 2019 Her Majesty the Queen in Right of Canada as represented by NRC Canada
About this paper
Cite this paper
Belacel, N., Durand, G., Leger, S., Bouchard, C. (2019). Scalable Collaborative Filtering Based on Splitting-Merging Clustering Algorithm. In: van den Herik, J., Rocha, A. (eds) Agents and Artificial Intelligence. ICAART 2018. Lecture Notes in Computer Science, vol 11352. Springer, Cham. https://doi.org/10.1007/978-3-030-05453-3_14
DOI: https://doi.org/10.1007/978-3-030-05453-3_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05452-6
Online ISBN: 978-3-030-05453-3
eBook Packages: Computer Science, Computer Science (R0)