Recommendation using a clustering algorithm based on a hybrid features selection method

Journal of Intelligent Information Systems

Abstract

The cold-start problem is a potential weakness of Recommender Systems (RSs). It concerns the inability of the system to infer recommendations for new users or new items about which it does not have enough information. Specifically, when an item is new, the system may fail to perform well because too little information is available for that item. The most common solution in the literature is to combine content and collaborative information in a single RS. However, these hybrid solutions inherit the classical problems of natural language ambiguity and do not exploit semantic knowledge in their item representations. In this paper, we propose a hybrid RS composed of three modules to overcome those weaknesses. The first module relies on a content clustering algorithm that uses a Hybrid Features Selection Method (HFSM), which combines statistically and semantically relevant features to get the most out of the items' content. The second module is the Collaborative Filtering (CF) one, which depends only on users' ratings. The third module combines the previous two to solve the problem of missing values in the CF approach and to handle the new-item issue. The proposed hybrid recommender is evaluated against traditional item-based CF in two settings: a situation without cold start, and a simulated new-item scenario (an item with few or no ratings). The experiments show that the proposed hybrid recommender delivers more accurate predictions for any item and outperforms the classical CF approach, which fails in cold-start situations.
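
To make the role of the third module more concrete, the following is a minimal Python sketch of one way content clusters could be used to fill missing ratings before item-based CF is applied. It is not the authors' implementation: the cluster labels are assumed to be given (HFSM itself is not reproduced here), and the function names, the cluster-mean fill rule, the cosine similarity, the neighbourhood size k, and the toy rating matrix are all assumptions chosen for illustration.

```python
import numpy as np


def fill_with_cluster_means(R, clusters):
    """Fill each missing rating (NaN) of user u on item i with u's mean
    rating over the items u rated in i's content cluster. Falls back to
    u's overall mean, then to the global mean. (Assumed fill rule.)"""
    R = np.asarray(R, dtype=float)
    clusters = np.asarray(clusters)
    filled = R.copy()
    global_mean = np.nanmean(R)
    for u in range(R.shape[0]):
        rated = ~np.isnan(R[u])
        user_mean = R[u, rated].mean() if rated.any() else global_mean
        for c in np.unique(clusters):
            in_c = clusters == c
            known = in_c & rated
            value = R[u, known].mean() if known.any() else user_mean
            filled[u, in_c & ~rated] = value
    return filled


def predict_item_based(filled, R, user, item, k=2):
    """Item-based CF: similarity-weighted average of the user's real
    ratings on the k items most similar to `item`. Similarities are
    cosine similarities between columns of the filled matrix."""
    R = np.asarray(R, dtype=float)
    norms = np.linalg.norm(filled, axis=0)
    sims = filled.T @ filled[:, item] / (norms * norms[item] + 1e-12)
    rated = np.flatnonzero(~np.isnan(R[user]))
    rated = rated[rated != item]
    if rated.size == 0:
        return float(filled[user, item])
    top = rated[np.argsort(sims[rated])[::-1][:k]]
    weights = sims[top]
    if weights.sum() <= 0:
        return float(filled[user, item])
    return float(weights @ R[user, top] / weights.sum())


# Toy data (hypothetical): 4 users x 5 items; item 4 is new (no ratings).
R = np.array([
    [5, 4, 1, np.nan, np.nan],
    [4, 5, np.nan, 2, np.nan],
    [1, np.nan, 5, 4, np.nan],
    [np.nan, 2, 4, 5, np.nan],
])
# Assumed output of some content clustering: item 4 falls in cluster 0.
clusters = np.array([0, 0, 1, 1, 0])

filled = fill_with_cluster_means(R, clusters)
pred = predict_item_based(filled, R, user=0, item=4)
print(f"Predicted rating of user 0 for the new item: {pred:.2f}")
```

In this sketch the new item has no ratings at all, so plain item-based CF could not compute any similarity for it. After the cluster-based fill, its column is populated from each user's ratings on content-similar items, which is what lets the CF step produce a prediction.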

Notes

  1. www.google.com

  2. https://en.wikipedia.org/

  3. http://www.rottentomatoes.com/

  4. http://grouplens.org/datasets/movielens/

  5. https://drive.google.com/drive/folders/0B92HV51lXazyaEJGRW5Ib3pZa2c

  6. https://drive.google.com/file/d/0B92HV51lXazyMGJnMDNhdlpQZ0E/view

Author information

Corresponding author

Correspondence to Hdioud Ferdaous.

Cite this article

Ferdaous, H., Bouchra, F., Brahim, O. et al. Recommendation using a clustering algorithm based on a hybrid features selection method. J Intell Inf Syst 51, 183–205 (2018). https://doi.org/10.1007/s10844-017-0493-0

  • DOI: https://doi.org/10.1007/s10844-017-0493-0
