Abstract
Implicit feedback-based recommendation problems, which are typical of real-world applications, have recently received growing attention in the research community. From a practical point of view, the scalability of such methods is crucial. However, factorization-based algorithms that are efficient on explicit rating data become computationally inefficient when applied directly to implicit data; therefore, different techniques are needed to adapt them to implicit feedback. For alternating least squares (ALS) learning, several research contributions have proposed efficient adaptation techniques for implicit feedback. These algorithms scale linearly with the number of nonzero data points, but cubically with the number of features, which is a computational bottleneck that prevents the efficient use of accurate models with many factors. Also, map-reduce type big data techniques are not viable with ALS learning, because there is no known technique that solves the high communication overhead required for random access of the feature matrices. To overcome this drawback, we present two generic approximate variants for fast ALS learning, using conjugate gradient (CG) and coordinate descent (CD). Both CG and CD can be coupled with all methods using ALS learning. We demonstrate the advantages of the fast ALS variants on iTALS, a generic context-aware algorithm that applies ALS learning for tensor factorization on implicit data. In the experiments, we compare the approximate techniques with the base ALS learning in terms of training time, scalability, recommendation accuracy, and convergence. We show that the proposed solutions offer a trade-off between recommendation accuracy and training speed; this makes it possible to apply ALS-based methods efficiently even for billions of data points.
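To illustrate the idea behind the two approximate variants, the following is a minimal sketch, not the authors' implementation. In ALS, each feature vector is obtained by solving a K-by-K symmetric positive definite linear system, and the exact solve is what costs O(K^3). The sketch replaces that solve with a few conjugate gradient steps or a few coordinate descent sweeps. The function names (`solve_exact`, `solve_cg`, `solve_cd`), the synthetic system, and the iteration counts are assumptions made for illustration only.

```python
import numpy as np

def solve_exact(A, b):
    """Baseline: exact solve of A x = b, O(K^3) in the number of features K."""
    return np.linalg.solve(A, b)

def solve_cg(A, b, x0, n_iter):
    """Approximate solve of A x = b with n_iter conjugate gradient steps.

    A must be symmetric positive definite; each step costs O(K^2).
    """
    x = x0.copy()
    r = b - A @ x           # residual
    p = r.copy()            # search direction
    rs_old = r @ r
    for _ in range(n_iter):
        if rs_old < 1e-20:  # residual is already negligible
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def solve_cd(A, b, x0, n_sweeps):
    """Approximate solve of A x = b by cyclic coordinate descent.

    Each update minimizes 0.5 x^T A x - b^T x in one coordinate
    (this is the Gauss-Seidel iteration); one sweep costs O(K^2).
    """
    x = x0.copy()
    for _ in range(n_sweeps):
        for k in range(len(b)):
            # Subtract the contribution of all coordinates except k.
            x[k] = (b[k] - A[k] @ x + A[k, k] * x[k]) / A[k, k]
    return x

# Synthetic regularized normal equations, as they might arise in one ALS step
# (hypothetical data: 2000 examples, K = 20 features, regularization 0.1).
rng = np.random.default_rng(0)
K = 20
Y = rng.normal(size=(2000, K))
A = Y.T @ Y + 0.1 * np.eye(K)
b = rng.normal(size=K)

x_exact = solve_exact(A, b)
x0 = np.zeros(K)  # in ALS, the previous epoch's feature vector is a natural start
for steps in (1, 2, 5):
    err_cg = np.linalg.norm(solve_cg(A, b, x0, steps) - x_exact)
    err_cd = np.linalg.norm(solve_cd(A, b, x0, steps) - x_exact)
    print(f"steps={steps}: CG error {err_cg:.2e}, CD error {err_cd:.2e}")
```

Running fewer steps gives a cheaper but less accurate feature vector, which is the accuracy versus training time trade-off described above. How many steps suffice in practice depends on the data and the model, so the counts used here should not be read as recommendations.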
Notes
For example, a user purchased an item or viewed a product page. Interactions are also called events or transactions.
It is beneficial if the data are stored in the shared memory as well, but they can also be stored on disk if properly indexed.
Here we assumed a relatively high density of \({\sim }1\,\%\) with 100 K users and 45 K items, which is realistic for \({\sim }45\) M records.
With a proper weighting scheme, iTALS could be used with explicit feedback as well.
\(DN^+=\sum _{i=1}^{D}{S_i}\) means that we only have one event/example for each user, for each item and each context state. In this case, CF methods are not applicable due to sparsity.
The complexity of Algorithm 4.1 is \(O(N_EN_IK)\), which is \(O\left( (K^2+N^+_jK)N_I\right) \) in our case for one feature vector.
Data were collected by the service provider of an online grocery store and a VoD store, respectively, by monitoring the purchases in the system. There were no recommender systems active during the data collection period.
This value is 1.0 for TV1 and TV2. This is possibly due to preprocessing by the original authors that removed duplicate events.
The actual speedup and improvement in scalability depend on the efficiency of certain key steps (e.g., matrix-vector multiplication for CG). These may differ from algorithm to algorithm.
With fixed list length and test set, these values are proportional to the recall@20 value.
In the following sense: \(N_I\) values relative to the number of features. That is, if K is lower/higher, then approximate methods reach the training time of ALS at lower/higher \(N_I\) values.
Acknowledgments
The work leading to these results has received funding from the European Union’s Seventh Framework Programme (FP7/2007-2013) under CrowdRec Grant Agreement No. 610594. The authors would like to thank Martha Larson for her useful comments on the paper.
About this article
Cite this article
Hidasi, B., Tikk, D. Speeding up ALS learning via approximate methods for context-aware recommendations. Knowl Inf Syst 47, 131–155 (2016). https://doi.org/10.1007/s10115-015-0863-2
DOI: https://doi.org/10.1007/s10115-015-0863-2