Scalability and sparsity issues in recommender datasets: a survey

  • Survey Paper
Knowledge and Information Systems

Abstract

Recommender systems are widely used in domains such as movies, news, and music, with the aim of offering users the most relevant proposals from a large set of available options. They are designed using techniques from many fields, including machine learning, information retrieval, data mining, linear algebra, and artificial intelligence. In-memory nearest-neighbor computation is a typical approach to collaborative filtering because of its high recommendation accuracy, but it scales poorly given the huge user and item bases and the small number of available ratings (i.e., data sparsity) in typical merchandising applications. To alleviate scalability and sparsity issues in recommender systems, several model-based approaches have been proposed. However, if research in recommender systems is to achieve its potential, there is a need to understand the prominent techniques used either directly to build recommender systems or to preprocess recommender datasets, along with their strengths and weaknesses. In this work, we present an overview of prominent traditional and advanced techniques that can effectively handle data dimensionality and data sparsity. The survey focuses on the applicability of advanced techniques in recommender systems, particularly clustering, biclustering, matrix factorization, graph-theoretic, and fuzzy techniques, and highlights recent research using each technique.

Notes

  1. LSI is an application of SVD in document retrieval [100].

  2. Mean Absolute Error (MAE) is a popular measure of prediction accuracy. It is the average absolute deviation of the recommender's predictions from the actual preferences given by the users.

  3. A rating matrix of n users and m items has n rows and m columns. Each cell of the rating matrix holds the rating of a user on an item; the entry \( r_{ui} \) is the rating of user u on item i.

  4. Demographic filtering creates a demographic profile of a user based on attributes such as age, gender, and occupation.

  5. http://www.netflixprize.com/.

  6. https://en.wikipedia.org/wiki/Stochastic_gradient_descent.

  7. Available at http://www.grouplens.org/data.

  8. A scoring algorithm is an algorithm that orders the items in I (the set of all items) for a user \( u \in U \) according to similarities between vertex u and the vertices in I, as defined by the structure of the graph G [26].
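
Notes 2 and 3 can be made concrete with a small sketch. The rating values, the function names `sparsity` and `mean_absolute_error`, and the user-mean baseline predictor below are all assumptions chosen for illustration; none of them come from the survey itself.

```python
import numpy as np

# Hypothetical rating matrix with n = 4 users (rows) and m = 5 items (columns).
# R[u, i] plays the role of r_ui; np.nan marks a missing rating.
R = np.array([
    [5.0, 3.0, np.nan, 1.0, np.nan],
    [4.0, np.nan, np.nan, 1.0, np.nan],
    [np.nan, 1.0, 5.0, np.nan, 4.0],
    [1.0, np.nan, 4.0, 5.0, np.nan],
])


def sparsity(ratings):
    """Fraction of user-item cells that carry no rating."""
    return float(np.isnan(ratings).mean())


def mean_absolute_error(actual, predicted):
    """Average |actual - predicted| over the cells with an observed rating."""
    observed = ~np.isnan(actual)
    return float(np.abs(actual[observed] - predicted[observed]).mean())


# Illustrative baseline (an assumption, not a method from the survey):
# predict every rating of a user as that user's mean observed rating.
user_means = np.nanmean(R, axis=1, keepdims=True)
predictions = np.broadcast_to(user_means, R.shape)

print("sparsity:", sparsity(R))
print("MAE of user-mean baseline:", mean_absolute_error(R, predictions))
```

The sketch scores the baseline on the same ratings it was computed from, which keeps the example short; a real evaluation would compute MAE on held-out ratings.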

References

  1. Adams E, Walczak B, Vervaet C, Risha PG, Massart DL (2002) Principal component analysis of dissolution data with missing elements. Int J Pharm 234(1):169–178

  2. Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749

  3. Adomavicius G, Tuzhilin A (2015) Context-aware recommender systems. In: Recommender systems handbook. Springer, New York, pp. 191–226

  4. Aggarwal CC, Wolf JL, Wu KL, Yu PS (1999) Horting hatches an egg: A new graph-theoretic approach to collaborative filtering. In: Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 201–212

  5. Aggarwal CC, Reddy CK (eds) (2013) Data clustering: algorithms and applications. Chapman and Hall/CRC, Boston

  6. Ahn S, Korattikara A, Liu N, Rajan S, Welling M (2015) Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 9–18

  7. Al Mamunur Rashid SKL, Karypis G, Riedl J (2006) ClustKNN: a highly scalable hybrid model-and memory-based CF algorithm. In: Proceeding of WebKDD

  8. Alqadah F, Reddy CK, Hu J, Alqadah HF (2015) Biclustering neighborhood-based collaborative filtering method for top-n recommender systems. Knowl Inf Syst 44(2):475–491

  9. Altingovde IS, Subakan ÖN, Ulusoy Ö (2013) Cluster searching strategies for collaborative recommendation systems. Inf Process Manag 49(3):688–697

  10. Amatriain X, Jaimes A, Oliver N, Pujol JM (2011) Data mining methods for recommender systems. In: Recommender systems handbook. Springer, New York, pp. 39–71

  11. Ankerst M, Breunig MM, Kriegel HP, Sander J (1999) OPTICS: ordering points to identify the clustering structure. In: ACM SIGMOD Record, vol 28, no 2. ACM, pp 49–60

  12. Baltrunas L, Ludwig B, Ricci F (2011) Matrix factorization techniques for context aware recommendation. In: Proceedings of the fifth ACM conference on recommender systems. ACM, pp 301–304

  13. Bellogin A, Parapar J (2012) Using graph partitioning techniques for neighbour selection in user-based collaborative filtering. In: Proceedings of the sixth ACM conference on recommender systems. ACM, pp 213–216

  14. Bilge A, Polat H (2013) A comparison of clustering-based privacy-preserving collaborative filtering schemes. Appl Soft Comput 13(5):2478–2489

  15. Birtolo C, Ronca D (2013) Advances in clustering collaborative filtering by means of Fuzzy C-means and trust. Expert Syst Appl 40(17):6997–7009

  16. Bobadilla J, Ortega F, Hernando A, Gutiérrez A (2013) Recommender systems survey. Knowl Based Syst 46:109–132

  17. Bradley PS, Fayyad UM, Reina C (1998) Scaling clustering algorithms to large databases. In: KDD, pp 9–15

  18. Breese JS, Heckerman D, Kadie C (1998) Empirical analysis of predictive algorithms for collaborative filtering. In: Proceedings of the fourteenth conference on uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc., pp 43–52

  19. Burke R (2002) Hybrid recommender systems: survey and experiments. User Model User Adap Inter 12(4):331–370

  20. Cacheda F, Carneiro V, Fernández D, Formoso V (2011) Comparison of collaborative filtering algorithms: limitations of current techniques and proposals for scalable, high-performance recommender systems. ACM Trans Web (TWEB) 5(1):2

  21. Cantador I, Bellogín A, Castells P (2008) A multilayer ontology-based hybrid recommendation model. AI Commun 21(2–3):203–210

  22. Cao Y, Li Y (2007) An intelligent fuzzy-based recommendation system for consumer electronic products. Expert Syst Appl 33(1):230–240

  23. Chee SHS, Han J, Wang K (2001) Rectree: An efficient collaborative filtering method. In: International conference on data warehousing and knowledge discovery. Springer, Berlin, pp. 141–151

  24. Cheng Y, Church GM (2000) Biclustering of expression data. In: ISMB, vol 8, no 2000, pp 93–103

  25. Codina V, Ricci F, Ceccaroni L (2016) Distributional semantic pre-filtering in context-aware recommender systems. User Model User Adap Inter 26(1):1–32

  26. Cooper C, Lee SH, Radzik T, Siantos Y (2014) Random walks in recommender systems: exact computation and simulations. In: Proceedings of the 23rd international conference on world wide web. ACM, pp 811–816

  27. Cornelis C, Lu J, Guo X, Zhang G (2007) One-and-only item recommendation with fuzzy logic techniques. Inf Sci 177(22):4906–4921

  28. Cremonesi P, Koren Y, Turrin R (2010) Performance of recommender algorithms on top-n recommendation tasks. In: Proceedings of the fourth ACM conference on recommender systems. ACM, pp 39–46

  29. de Castro PA, de França FO, Ferreira HM, Von Zuben FJ (2007) Evaluating the performance of a biclustering algorithm applied to collaborative filtering-a comparative analysis. In: 7th international conference on hybrid intelligent systems, 2007. HIS 2007. IEEE, pp 65–70

  30. de França FO, Coelho GP, Von Zuben FJ (2009) Coherent recommendations using biclustering. In: Proceedings of the XXX Congresso Ibero-Latino-Americano de Métodos Computacionais em Engenharia (CILAMCE), pp 1–15

  31. Deodhar M, Ghosh J (2010) SCOAL: a framework for simultaneous co-clustering and learning from complex data. ACM Trans Knowl Discov Data (TKDD) 4(3):11

  32. Dhillon IS (2001) Co-clustering documents and words using bipartite spectral graph partitioning. In: Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 269–274

  33. Diao Q, Qiu M, Wu CY, Smola AJ, Jiang J, Wang C (2014) Jointly modeling aspects, ratings and sentiments for movie recommendation (jmars). In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp. 193–202

  34. Ding C, He X, Simon HD (2005) On the equivalence of nonnegative matrix factorization and spectral clustering. In: Proceedings of the 2005 SIAM international conference on data mining. Society for Industrial and Applied Mathematics, pp 606–610

  35. Ding C, He X (2004) K-means clustering via principal component analysis. In: Proceedings of the twenty-first international conference on machine learning, p 29. ACM

  36. Esslimani I, Brun A, Boyer A (2009) A collaborative filtering approach combining clustering and navigational based correlations. In: WEBIST, pp 364–369

  37. Fabricio O, Ferreira HM, Von Zuben FJ (2007) Applying biclustering to perform collaborative filtering. In: Seventh international conference on intelligent systems design and applications, 2007. ISDA 2007. IEEE, pp 421–426

  38. Fouss F, Pirotte A, Renders JM, Saerens M (2007) Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans Knowl Data Eng 19(3):355

  39. Frémal S, Lecron F (2017) Weighting strategies for a recommender system using item clustering based on genres. Expert Syst Appl 77:105–113

  40. Gai L, Lei L (2014) Dual collaborative topic modeling from implicit feedbacks. In: 2014 International conference on security, pattern analysis, and cybernetics (SPAC). IEEE, pp 395–404

  41. Gao M, Ling B, Yuan Q, Xiong Q, Yang L (2014) A robust collaborative filtering approach based on user relationships for recommendation systems. Math Probl Eng. https://doi.org/10.1155/2014/162521

  42. George T, Merugu S (2005) A scalable collaborative filtering framework based on co-clustering. In: Fifth IEEE international conference on data mining. IEEE

  43. Ghazanfar MA, Prügel-Bennett A (2014) Leveraging clustering approaches to solve the gray-sheep users problem in recommender systems. Expert Syst Appl 41(7):3261–3275

  44. Goldberg K, Roeder T, Gupta D, Perkins C (2001) Eigentaste: a constant time collaborative filtering algorithm. Inf Retr 4(2):133–151

  45. Gong S (2010) A collaborative filtering recommendation algorithm based on user clustering and item clustering. JSW 5(7):745–752

  46. Guo G, Zhang J, Yorke-Smith N (2015) Leveraging multiviews of trust and similarity to enhance clustering-based recommender systems. Knowl Based Syst 74:14–27

  47. Han J, Kamber M, Pei J (2011) Data mining: concepts and techniques. Elsevier, New York

  48. Haruechaiyasak C, Tipnoe C, Kongyoung S, Damrongrat C, Angkawattanawit N (2005) A dynamic framework for maintaining customer profiles in e-commerce recommender systems. In: The 2005 IEEE international conference on e-technology, e-commerce and e-service, 2005. EEE’05. Proceedings. IEEE, pp 768–771

  49. Herlocker J, Konstan JA, Riedl J (2002) An empirical analysis of design choices in neighborhood-based collaborative filtering algorithms. Inf Retr 5(4):287–310

  50. Hofmann T (2004) Latent semantic models for collaborative filtering. ACM Trans Inf Syst (TOIS) 22(1):89–115

  51. Hoseini E, Hashemi S, Hamzeh A (2012) A levelwise spectral co-clustering algorithm for collaborative filtering. In: Proceedings of the 6th international conference on ubiquitous information management and communication. ACM, p 6

  52. Hu R, Dou W, Liu J (2014) Clubcf: a clustering-based collaborative filtering approach for big data application. IEEE Trans Emerg Top Comput 2(3):302–313

  53. Hu Y, Koren Y, Volinsky C (2008) Collaborative filtering for implicit feedback datasets. In: Eighth IEEE international conference on data mining, 2008. ICDM’08. IEEE, pp 263–272

  54. Javari A, Jalili M (2014) Cluster-based collaborative filtering for sign prediction in social networks with positive and negative links. ACM Trans Intell Syst Technol (TIST) 5(2):24

  55. Jiang XM, Song WG, Feng WG (2006) Optimizing collaborative filtering by interpolating the individual and group behaviors. Front WWW Res Dev APWeb 2006:568–578

  56. Ju C, Xu C (2013) A new collaborative recommendation approach based on users clustering using artificial bee colony algorithm. Sci World J. https://doi.org/10.1155/2013/869658

  57. Kelleher J, Bridge D (2003) Rectree centroid: an accurate, scalable collaborative recommender. AICS 2003:7

  58. Kim D, Yum BJ (2005) Collaborative filtering based on iterative principal component analysis. Expert Syst Appl 28(4):823–830

  59. Kim KJ, Ahn H (2017) Recommender systems using cluster-indexing collaborative filtering and social data analytics. Int J Prod Res. https://doi.org/10.1080/00207543.2017.1287443

  60. Konstantopoulos T (2009) Introductory lecture notes on markov chains and random walks. Department of Mathematics, Uppsala University, 200(9)

  61. Koren Y (2008) Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 426–434

  62. Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42(8):30

  63. Koren Y (2010) Collaborative filtering with temporal dynamics. Commun ACM 53(4):89–97

  64. Koren Y (2010) Factor in the neighbors: scalable and accurate collaborative filtering. ACM Trans Knowl Discov Data (TKDD) 4(1):1

  65. Kefalas P, Symeonidis P, Manolopoulos Y (2016) A graph-based taxonomy of recommendation algorithms and systems in LBSNs. IEEE Trans Knowl Data Eng 28(3):604–622

  66. Leung CWK, Chan SCF, Chung FL (2006) A collaborative filtering framework based on fuzzy association rules and multiple-level similarity. Knowl Inf Syst 10(3):357–381

  67. Li Q, Kim BM (2003) Clustering approach for hybrid recommender system. In: IEEE/WIC international conference on web intelligence, 2003. WI 2003. Proceedings. IEEE, pp 33–38

  68. Li T, Ding CH (2013) Nonnegative matrix factorizations for clustering: a survey

  69. Li X, Murata T (2012) Using multidimensional clustering based collaborative filtering approach improving recommendation diversity. In: Proceedings of the 2012 IEEE/WIC/ACM international joint conferences on web intelligence and intelligent agent technology, vol 03. IEEE Computer Society, pp 169–174

  70. Lilien GL, Rangaswamy A (2004) Marketing engineering: computer-assisted marketing analysis and planning. DecisionPro, State College

  71. Linden G, Smith B, York J (2003) Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput 7(1):76–80

  72. Liu NN, Xiang EW, Zhao M, Yang Q (2010) Unifying explicit and implicit feedback for collaborative filtering. In: Proceedings of the 19th ACM international conference on Information and knowledge management. ACM, pp 1445–1448

  73. Lu J, Shambour Q, Xu Y, Lin Q, Zhang G (2013) A web-based personalized business partner recommendation system using fuzzy semantic techniques. Comput Intell 29(1):37–69

  74. Lu J, Wu D, Mao M, Wang W, Zhang G (2015) Recommender system application developments: a survey. Decis Support Syst 74:12–32

  75. Lucas JP, Laurent A, Moreno MN, Teisseire M (2012) A fuzzy associative classification approach for recommender systems. Int J Uncertain Fuzziness Knowl Based Syst 20(04):579–617

  76. Madeira SC, Oliveira AL (2004) Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Trans Comput Biol Bioinform (TCBB) 1(1):24–45

  77. Mazumder R, Hastie T, Tibshirani R (2010) Spectral regularization algorithms for learning large incomplete matrices. J Mach Learn Res 11:2287–2322

  78. Melville P, Sindhwani V (2010) Recommender systems. Encyclopedia of machine learning. Springer, US, pp 829–838

  79. Melville P, Mooney RJ, Nagarajan R (2002) Content-boosted collaborative filtering for improved recommendations. In: AAAI/IAAI, pp 187–192

  80. Merialdo AKB (1999) Clustering for collaborative filtering applications. Intelli Image Process Data Anal Inf Retr 3:199

  81. Mnih A, Salakhutdinov RR (2008) Probabilistic matrix factorization. In: Advances in neural information processing systems, pp 1257–1264

  82. Moreira A, Santos MY, Carneiro S (2005) Density-based clustering algorithms—DBSCAN and SNN. University of Minho-Portugal, Braga

  83. Nathanson T, Bitton E, Goldberg K (2007) Eigentaste 5.0: constant-time adaptability in a recommender system using item clustering. In: Proceedings of the 2007 ACM conference on Recommender systems. ACM, pp 149–152

  84. Ntoutsi E, Stefanidis K, Nørvåg K, Kriegel HP (2012) Fast group recommendations by applying user clustering. In: International conference on conceptual modeling. Springer, Berlin, pp 126–140

  85. O’Connor M, Herlocker J (1999) Clustering items for collaborative filtering. In: Proceedings of the ACM SIGIR workshop on recommender systems, vol 128. UC Berkeley

  86. Paterek A (2007) Improving regularized singular value decomposition for collaborative filtering. In: Proceedings of KDD cup and workshop, vol 2007, pp 5–8

  87. Pereira ALV, Hruschka ER (2015) Simultaneous co-clustering and learning to address the cold start problem in recommender systems. Knowl Based Syst 82:11–19

  88. Porcel C, Moreno JM, Herrera-Viedma E (2009) A multi-disciplinar recommender system to advice research resources in University Digital Libraries. Expert Syst Appl 36(10):12520–12528

  89. Porcel C, López-Herrera AG, Herrera-Viedma E (2009) A recommender system for research resources based on fuzzy linguistic modeling. Expert Syst Appl 36(3):5173–5183

  90. Porcel C, Herrera-Viedma E (2010) Dealing with incomplete information in a fuzzy linguistic recommender system to disseminate information in university digital libraries. Knowl Based Syst 23(1):32–39

  91. Pham MC, Cao Y, Klamma R, Jarke M (2011) A clustering approach for collaborative filtering recommendation using social network analysis. J UCS 17(4):583–604

  92. Rege M, Dong M, Fotouhi F (2006) Co-clustering documents and words using bipartite isoperimetric graph partitioning. In: Sixth international conference on data mining, 2006. ICDM’06. IEEE, pp 532–541

  93. Rendle S (2012) Factorization machines with libfm. ACM Trans Intell Syst Technol (TIST) 3(3):57

  94. Said A, Bellogín A (2014) Comparative recommender system evaluation: benchmarking recommendation frameworks. In: Proceedings of the 8th ACM conference on recommender systems. ACM, pp 129–136

  95. Saito T, Kawahara K, Okada Y (2013) Recommendation method using bicluster network method. In: Proceedings of the international multiconference of engineers and computer scientists, vol 1

  96. Salakhutdinov R, Mnih A (2008) Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In: Proceedings of the 25th international conference on Machine learning. ACM, pp 880–887

  97. Sarwar B, Karypis G, Konstan J, Riedl J (2000) Application of dimensionality reduction in recommender system—a case study (no. TR-00-043). Minnesota Univ. Minneapolis Dept. of Computer Science

  98. Sarwar B, Karypis G, Konstan J, Riedl J (2001) Item-based collaborative filtering recommendation algorithms. In: Proceedings of the 10th international conference on World Wide Web. ACM, pp 285–295

  99. Sarwar BM, Karypis G, Konstan J, Riedl J (2002) Recommender systems for large-scale e-commerce: scalable neighborhood formation using clustering. In: Proceedings of the fifth international conference on computer and information technology, vol 1

  100. Schafer J (2009) The application of data-mining to recommender systems. Encycl Data Warehous Min 1:44–48

  101. Shepitsen A, Gemmell J, Mobasher B, Burke R (2008) Personalized recommendation in social tagging systems using hierarchical clustering. In: Proceedings of the 2008 ACM conference on Recommender systems. ACM, pp 259–266

  102. Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905

  103. Shinde SK, Kulkarni UV (2011) Hybrid personalized recommender system using modified Fuzzy C-Means clustering algorithm. Int J Artif Intell Expert Syst (IJAE) 1(4):88

  104. Simon F (2006) Netflix update: try this at home. Retrieved 21 June 2017 from http://sifter.org/simon/journal/20061211.html

  105. Son LH (2014) HU-FCF: a hybrid user-based fuzzy collaborative filtering method in Recommender Systems. Expert Syst Appl Int J 41(15):6861–6870

  106. Sun Y, Fan H, Bakillah M, Zipf A (2015) Road-based travel recommendation using geo-tagged images. Comput Environ Urban Syst 53:110–122

  107. Sundermann CV, Domingues MA, Marcacini RM, Rezende SO (2014) Using topic hierarchies with privileged information to improve context-aware recommender systems. In: 2014 Brazilian conference on intelligent systems (BRACIS). IEEE, pp 61–66

  108. Suryavanshi B, Shiri N, Mudur S (2005) A fuzzy hybrid collaborative filtering technique for web personalization. In: Proceedings of 3rd international workshop on intelligent techniques for web personalization (ITWP 2005), 19th international joint conference on artificial intelligence (IJCAI 2005), pp 1–8

  109. Suryavanshi BS, Shiri N, Mudur SP (2005) An efficient technique for mining usage profiles using relational fuzzy subtractive clustering. In: International workshop on challenges in web information retrieval and integration, 2005. WIRI’05. Proceedings. IEEE, pp 23–29

  110. Symeonidis P, Nanopoulos A, Papadopoulos A, Manolopoulos Y (2006) Nearest-biclusters collaborative filtering with constant values. In: International workshop on knowledge discovery on the web. Springer, Berlin, pp 36–55

  111. Symeonidis P, Nanopoulos A, Papadopoulos AN, Manolopoulos Y (2008) Nearest-biclusters collaborative filtering based on constant and coherent values. Inf Retr 11(1):51–75

  112. Terán L, Meier A (2010) A fuzzy recommender system for eElections. In: International conference on electronic government and the information systems perspective. Springer, Berlin, pp 62–76

  113. Thong NT (2015) HIFCF: an effective hybrid model between picture fuzzy clustering and intuitionistic fuzzy recommender systems for medical diagnosis. Expert Syst Appl 42(7):3682–3701

  114. Ungar LH, Foster DP (1998) Clustering methods for collaborative filtering. In: AAAI workshop on recommendation systems, vol 1, pp 114–129

  115. Unger M, Bar A, Shapira B, Rokach L (2016) Towards latent context-aware recommendation systems. Knowl Based Syst 104:165–178

  116. Wang H, Wang W, Yang J, Yu PS (2002) Clustering by pattern similarity in large data sets. In: Proceedings of the 2002 ACM SIGMOD international conference on Management of data. ACM, pp 394–405

  117. Wang F, Ma S, Yang L, Li T (2006) Recommendation on item graphs. In: Sixth international conference on data mining, 2006. ICDM’06. IEEE, pp 1119–1123

  118. Wang X, He D, Chen D, Xu J (2015) Clustering-based collaborative filtering for link prediction. In: AAAI, pp 332–338

  119. Wang S, Li C, Zhao K, Chen H (2017) Context-aware recommendations with random partition factorization machines. Data Sci Eng 2(2):125–135

  120. West JD, Wesley-Smith I, Bergstrom CT (2016) A recommendation system based on hierarchical clustering of an article-level citation network. IEEE Trans Big Data 2(2):113–123

  121. Xu B, Bu J, Chen C, Cai D (2012) An exploration of improving collaborative recommender systems via user-item subgroups. In: Proceedings of the 21st international conference on World Wide Web. ACM, pp 21–30

  122. Xue GR, Lin C, Yang Q, Xi W, Zeng HJ, Yu Y, Chen Z (2005) Scalable collaborative filtering using cluster-based smoothing. In: Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval. ACM, pp 114–121

  123. Yager RR (2003) Fuzzy logic methods in recommender systems. Fuzzy Sets Syst 136(2):133–149

  124. Yang J, Wang W, Wang H, Yu P (2002) δ-clusters: capturing subspace correlation in a large data set. In: 18th International conference on data engineering, 2002. Proceedings. IEEE, pp 517–528

  125. Yao Q, Kwok JT (2015) Accelerated inexact soft-impute for fast large-scale matrix completion. In: Twenty-fourth international joint conference on artificial intelligence

  126. Yuan NJ, Zheng Y, Zhang L, Xie X (2013) T-finder: a recommender system for finding passengers and vacant taxis. IEEE Trans Knowl Data Eng 25(10):2390–2403

  127. Zhang D, Hsu CH, Chen M, Chen Q, Xiong N, Lloret J (2014) Cold-start recommendation using bi-clustering and fusion for large-scale social recommender systems. IEEE Trans Emerg Top Comput 2(2):239–250

  128. Zahra S, Ghazanfar MA, Khalid A, Azam MA, Naeem U, Prugel-Bennett A (2015) Novel centroid selection approaches for KMeans-clustering based recommender systems. Inf Sci 320:156–189

  129. Zhou D, Zhu S, Yu K, Song X, Tseng BL, Zha H, Giles CL (2008) Learning multiple graphs for document recommendations. In: Proceedings of the 17th international conference on World Wide Web. ACM, pp 141–150

  130. Zenebe A, Norcio AF (2009) Representation, similarity measures and aggregation methods using fuzzy sets for content-based recommender systems. Fuzzy Sets Syst 160(1):76–94

Correspondence to Monika Singh.

Singh, M. Scalability and sparsity issues in recommender datasets: a survey. Knowl Inf Syst 62, 1–43 (2020). https://doi.org/10.1007/s10115-018-1254-2

  • DOI: https://doi.org/10.1007/s10115-018-1254-2
