Abstract
Social collaborative filtering recommender systems extend the traditional user-to-item interaction with explicit user-to-user relationships, thereby allowing for a wider exploration of correlations among users and items, which may lead to better recommendations. A number of methods have been proposed for exploring the social network, either locally (i.e. in the vicinity of each user) or globally. In this paper, we propose a novel methodology for collaborative filtering social recommendation that tries to combine the merits of both approaches, based on the soft-clustering of the Friend-of-a-Friend (FoaF) network of each user. This task is accomplished by the non-negative factorization of the adjacency matrix of the FoaF graph, while the edge-centric logic of the factorization algorithm is improved by incorporating more general structural properties of the graph, such as the number of edges and stars, through the introduction of exponential random graph models. The preliminary results obtained reveal the potential of this idea.
References
Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749. doi:10.1109/TKDE.2005.99
Alexandridis G, Siolas G, Stafylopatis A (2013) Improving social recommendations by applying a personalized item clustering policy. In: Proceedings of the fifth ACM RecSys workshop on recommender systems and the social web co-located with the 7th ACM conference on recommender systems (RecSys 2013), Hong Kong, China, 13 Oct 2013. http://ceur-ws.org/Vol-1066/Paper1
Alexandridis G, Siolas G, Stafylopatis A (2015) Accuracy versus novelty and diversity in recommender systems: a nonuniform random walk approach. In: Ulusoy O, Tansel AU, Arkun E (eds) Recommendation and search in social networks, lecture notes in social networks. Springer, Berlin, pp 41–57. doi:10.1007/978-3-319-14379-8_3
Alpaydin E (2014) Introduction to machine learning, 3rd edn. MIT Press, Cambridge
Bellogin A, Cantador I, Diez F, Castells Chavarriaga E (2011) An empirical comparison of social, collaborative filtering, and hybrid recommenders. ACM TIST 4:14
Bennett J, Lanning S (2007) The netflix prize. In: Proceedings of the KDD Cup Workshop 2007, ACM, New York, pp 3–6. http://www.cs.uic.edu/~liub/KDD-cup-2007/NetflixPrize-description
Cantador I, Brusilovsky P, Kuflik T (2011) 2nd workshop on information heterogeneity and fusion in recommender systems (hetrec 2011). In: Proceedings of the 5th ACM conference on recommender systems, ACM, New York, RecSys 2011
de Wit JJ (2008) Evaluating recommender systems—an evaluation framework to predict user satisfaction for recommender systems in an electronic programme guide context. Master’s thesis, University of Twente
Desrosiers C, Karypis G (2011) A comprehensive survey of neighborhood-based recommendation methods. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, New York, pp 107–144. doi:10.1007/978-0-387-85820-3_4
Golbeck JA (2005) Computing and applying trust in web-based social networks. PhD thesis, University of Maryland, College Park, AAI3178583
Hu Y, Koren Y, Volinsky C (2008) Collaborative filtering for implicit feedback datasets. In: Proceedings of the 2008 Eighth IEEE international conference on data mining, IEEE Computer Society, Washington, ICDM ’08, pp 263–272. doi:10.1109/ICDM.2008.22
Hubbard J (1959) Calculation of partition functions. Phys Rev Lett 3:77–78. doi:10.1103/PhysRevLett.3.77
Jamali M (2010) The flixster dataset. http://www.cs.sfu.ca/~sja25/personal/datasets/
Jamali M, Ester M (2009) Trustwalker: a random walk model for combining trust-based and item-based recommendation. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, New York, KDD ’09, pp 397–406. doi:10.1145/1557019.1557067
Jamali M, Ester M (2010) A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the Fourth ACM conference on recommender systems, ACM, New York, RecSys ’10, pp 135–142. doi:10.1145/1864708.1864736
Konstas I, Stathopoulos V, Jose JM (2009) On social networks and collaborative recommendation. In: Proceedings of the 32nd international ACM SIGIR conference on research and development in information retrieval, ACM, New York, SIGIR ’09, pp 195–202. doi:10.1145/1571941.1571977
Koren Y (2008) Factorization meets the neighborhood: a multifaceted collaborative filtering model. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, KDD ’08, pp 426–434. doi:10.1145/1401890.1401944
Lee D, Seung H (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791. http://www.nature.com/nature/journal/v401/n6755/abs/401788a0.html
Lee DD, Seung HS (2000) Algorithms for non-negative matrix factorization. In: NIPS, MIT Press, pp 556–562
Ma H, King I, Lyu MR (2009) Learning to recommend with social trust ensemble. In: Proceedings of the 32nd International ACM SIGIR conference on research and development in information retrieval, ACM, New York, SIGIR ’09, pp 203–210. doi:10.1145/1571941.1571978
Massa P, Avesani P (2009) Trust metrics in recommender systems. In: Golbeck J (ed) Computing with social trust, human computer interaction series. Springer, London, pp 259–285. doi:10.1007/978-1-84800-356-9_10
Newman MEJ (2010) Networks: an introduction, 1st edn. Oxford University Press, Oxford
Nunez-Gonzalez JD, Grana M, Apolloni B (2015) Reputation features for trust prediction in social networks. Neurocomputing 166:1–7. doi:10.1016/j.neucom.2014.10.099
Park J, Newman M (2004a) Solution of the two-star model of a network. Phys Rev E 70(6):066146. doi:10.1103/PhysRevE.70.066146
Park J, Newman M (2004b) Statistical mechanics of networks. Phys Rev E 70(6):066117. doi:10.1103/PhysRevE.70.066117, arXiv:cond-mat/0405566
Park J, Newman MEJ (2005) Solution for the properties of a clustered network. Phys Rev E 72(2):026136. doi:10.1103/PhysRevE.72.026136, arXiv:cond-mat/0412579
Psorakis I, Roberts S, Ebden M, Sheldon B (2011) Overlapping community detection using bayesian non-negative matrix factorization. Phys Rev E 83(6):066114. doi:10.1103/PhysRevE.83.066114
Resnick P, Iacovou N, Suchak M, Bergstrom P, Riedl J (1994) Grouplens: an open architecture for collaborative filtering of netnews. In: Proceedings of the 1994 ACM conference on computer supported cooperative work, ACM, New York, CSCW ’94, pp 175–186. doi:10.1145/192844.192905
Robins G, Pattison P, Kalish Y, Lusher D (2007) An introduction to exponential random graph (p*) models for social networks. Soc Netw 29(2):173–191. doi:10.1016/j.socnet.2006.08.002 (special section: advances in exponential random graph (p*) models)
Shi Y, Larson M, Hanjalic A (2010) List-wise learning to rank with matrix factorization for collaborative filtering. In: Proceedings of the fourth ACM conference on recommender systems, ACM, New York, RecSys ’10, pp 269–272. doi:10.1145/1864708.1864764
Wang YX, Zhang YJ (2013) Nonnegative matrix factorization: a comprehensive review. IEEE Trans Knowl Data Eng 25(6):1336–1353. doi:10.1109/TKDE.2012.51
Weimer M, Karatzoglou A, Le QV, Smola A (2007) Cofirank maximum margin matrix factorization for collaborative ranking. In: Proceedings of the 20th international conference on neural information processing systems, Curran Associates Inc., USA, NIPS’07, pp 1593–1600. http://dl.acm.org/citation.cfm?id=2981562.2981762
Yang X, Steck H, Guo Y, Liu Y (2012) On top-k recommendation using social networks. In: Proceedings of the sixth ACM conference on recommender systems, ACM, New York, RecSys ’12, pp 67–74. doi:10.1145/2365952.2365969
Yang X, Guo Y, Liu Y, Steck H (2014) A survey of collaborative filtering based social recommender systems. Comput Commun 41:1–10. doi:10.1016/j.comcom.2013.06.009
Zhou Y, Wilkinson D, Schreiber R, Pan R (2008) Large-scale parallel collaborative filtering for the netflix prize. In: Proceedings of the 4th international conference on algorithmic aspects in information and management, Springer-Verlag, Berlin, Heidelberg, AAIM ’08, pp 337–348. doi:10.1007/978-3-540-68880-8_32
Additional information
Responsible editor: G. Karypis.
Appendices
Appendix 1: Approximating the free energy of the 2-star model
The analysis that follows aims to show how the exact form of the free energy of the 2-star model is derived. Starting with the Hamiltonian of the aforementioned model (Eqs. 16 and 32 below)
the partition function Z becomes
According to Park and Newman (2004b), sums like that of Eq. 33 (containing terms of the form \(\mathrm {e}^{k^2}\)) are encountered in the study of interacting quantum systems and may be calculated through the application of the Hubbard–Stratonovich transformation (Hubbard 1959). This transformation converts a particle theory (in this case, expressed in terms of the node degree k) into the corresponding field theory, through the introduction of an auxiliary scalar field (in this case, \(\phi _i\), as demonstrated next).
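As a quick numerical sanity check of the transformation described above, the sketch below verifies the single-variable identity \(\mathrm {e}^{a k^2} = (4\pi a)^{-1/2}\int \mathrm {e}^{-\phi ^2/(4a) + k\phi }\,\mathrm {d}\phi \) for \(a > 0\), which trades a term quadratic in k for a term linear in k at the cost of an integral over an auxiliary variable \(\phi \). The constant a, the function name `hs_integral` and the integration settings are illustrative assumptions and are not the substitutions used in this appendix.

```python
import math


def hs_integral(a, k, half_width=12.0, steps=20_000):
    """Midpoint-rule estimate of (4*pi*a)**-0.5 * integral of exp(-phi**2/(4a) + k*phi).

    `a` must be strictly positive; `k` plays the role of a node degree.
    """
    if a <= 0:
        raise ValueError("a must be strictly positive")
    # Completing the square shows the integrand is a Gaussian centred at
    # phi = 2*a*k with standard deviation sqrt(2*a), so a finite window
    # around that centre captures essentially all of the mass.
    centre = 2.0 * a * k
    sigma = math.sqrt(2.0 * a)
    lo = centre - half_width * sigma
    hi = centre + half_width * sigma
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        phi = lo + (i + 0.5) * h
        total += math.exp(-phi * phi / (4.0 * a) + k * phi)
    return total * h / math.sqrt(4.0 * math.pi * a)


if __name__ == "__main__":
    a, k = 0.1, 5
    print("quadrature :", hs_integral(a, k))
    print("exp(a*k^2) :", math.exp(a * k * k))
```

The two printed values should agree closely, which illustrates why the quadratic degree term can be replaced by a linear coupling to \(\phi \) inside an integral.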
The Hubbard–Stratonovich transformation is based on Gaussian integrals, which may be written in either of the following two forms
with the real constant \(\alpha \) taking strictly positive values (\(\alpha > 0\)). Substituting \(\alpha \) and x according to Eq. 36 below
and subsequently plugging them into the Gaussian Integral of Eq. 34, yields
In order to eliminate the integral in Eq. 37, the second form of the Gaussian integral is used (Eq. 35). Substituting the constants \(\alpha , b, c\) and the unknown variable x according to Eq. 38 below
yields
The second term of the left-hand side of Eq. 37, in conjunction with Eq. 39, becomes
and Eq. 37 in conjunction with Eq. 40 becomes
Taking the product of the left-hand side of Eq. 41 for all n nodes of the graph and making use of the property of iterated integrals for multiple-variable functions, yields
The integral of the right-hand side of Eq. 42 is once again calculated by analogy with quantum mechanics (Park and Newman 2004a). More specifically, every \(\phi _i\) is thought to symbolize the contribution of a respective field during the movement of a particle in a one-dimensional system. Consequently, the effect of the overall field on the particle movement is approximated by the superposition of the individual fields \(\phi _i\), which is mathematically formulated as a path integral (Eq. 43)
As a consequence, Eq. 42 is rewritten in the form
Substituting Eq. 44 into Eq. 33, the partition function Z becomes
In the above equation, the order of the integration and the summation over all possible graphs in the model has been interchanged. It should also be noted that the term containing the sum of the squares of the fields \(\phi _i\) is independent of the possible configurations of the graphs in the model and therefore Eq. 45 may take the following form
At this point, the sum over all possible model configurations has to be evaluated and the second term of the product within the integral of Eq. 46 is further analyzed as follows
and
Substituting Eq. 47 into Eq. 46 yields
where \(\mathscr {H}(\phi )\) is the effective Hamiltonian
The importance of Eq. 48 above lies in the fact that the initial model (Eq. 33) has been transformed into a field theory of a continuous variable (evaluated at n points). Unfortunately, the integral of the equation above cannot be calculated in closed form (Park and Newman 2004b), but only through approximation techniques, such as mean-field theory.
Appendix 1.1: Mean-field theory
The simplest possible approximation is that of the mean-field, under which fluctuations in the field are ignored and \(\phi _i\) is always set to its most probable value, located at the saddle point where the first derivative is equal to zero (Park and Newman 2004a).
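To make the saddle-point idea concrete, the following sketch solves a mean-field self-consistency equation by fixed-point iteration. It assumes the relation \(p = \tfrac{1}{2}\left[ \tanh (2Jp + B) + 1\right] \) that is commonly quoted for the connectance p of the 2-star model (Park and Newman 2004a). The symbols J, B and p, the function name and the stopping rule are assumptions of this illustration and may not match the notation or the exact equations of this appendix.

```python
import math


def mean_field_connectance(J, B, p0=0.5, tol=1e-12, max_iter=10_000):
    """Iterate p <- 0.5 * (tanh(2*J*p + B) + 1) until successive values agree.

    J (2-star coupling), B (edge field) and p0 (starting guess) are
    hypothetical parameter names chosen for this sketch.
    """
    p = p0
    for _ in range(max_iter):
        p_next = 0.5 * (math.tanh(2.0 * J * p + B) + 1.0)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p  # best estimate if the tolerance was not reached


if __name__ == "__main__":
    for J, B in [(0.0, 0.0), (0.0, 1.0), (0.5, -0.3)]:
        print(J, B, mean_field_connectance(J, B))
```

The derivative of the right-hand side with respect to p is at most J, so the iteration is a contraction for J < 1 and converges to a single solution. For larger J the assumed equation may admit several solutions, in which case the value returned depends on the starting guess `p0`.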
Using the identity
the following equation emerges
which has the symmetric solution \(\phi _0 = \phi _i\) for every i
In this case, the path integral of Eq. 48 simplifies to n independent Gaussian integrals, and the partition function becomes
and, finally, the free energy becomes
Appendix 2: Introducing ERGM into Bayesian NMF
Having chosen the likelihood function (Eq. 25) and the a-priori distribution (Eq. 24), we may employ the classic formula of Bayesian Inference (Eq. 6) to approximate the a-posteriori probability of the model parameters (elements of matrix \(\widetilde{A}=WH\)), given the data (elements of matrix A) and the hyper-parameters (\(\Theta \))
The partition function Z is independent of the parameters \(\widetilde{a_{ij}}\) and the data \(a_{ij}\) of the model. Hyper-parameter \(\Theta \) is also deterministically defined (Eq. 24). Therefore, Eq. 54 is simplified to
The element \(\widetilde{a_{ij}}\) of matrix \(\widetilde{A}\) is computed as the inner product of the \(i^{\text{th}}\) row vector of matrix W and the \(j^{\text{th}}\) column vector of matrix H
The objective is to find those values for \(\mathbf {w_i}^\top ,\mathbf {h_j}\) that maximize the a-posteriori probability of the parameters of the model (right-hand side of Eq. 56). As NMF has been defined as a minimization problem (Eq. 5), Eq. 56 above must be converted to an equivalent minimization problem. This conversion is achieved by taking the negative natural logarithm of the aforementioned equation (Wang and Zhang 2013)
Using Stirling's approximation \(\left( \ln (x!) \approx x\ln x - x\right) \) for the term \(\ln (a_{ij}!)\) yields
The gradient of Eq. 58 with respect to vectors \(\mathbf {w_i}^\top , \mathbf {h_j}\) is computed as follows
and vectors \(\mathbf {w_i}^\top , \mathbf {h_j}\) are updated according to the multiplicative update rules below, where the update factor is the ratio of the negative component of the gradient to the positive component (Lee and Seung 2000)
yielding the following update rules for the basis and coefficient matrices W, H
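For orientation, the sketch below implements the standard multiplicative updates of Lee and Seung (2000) for the generalised Kullback–Leibler divergence, which equals the Poisson negative log-likelihood up to terms that are constant in W and H. It is not the update rule derived in this appendix: the contribution of the ERGM prior is omitted, and the function names, the constant `EPS`, the random initialisation and the example graph are all assumptions made for illustration.

```python
import math
import random

EPS = 1e-12  # assumed guard against division by zero and log(0)


def matmul(W, H):
    """Plain-Python product of an n x K and a K x m matrix (lists of lists)."""
    return [[sum(W[i][k] * H[k][j] for k in range(len(H)))
             for j in range(len(H[0]))] for i in range(len(W))]


def kl_divergence(A, W, H):
    """Generalised KL divergence D(A || WH)."""
    R = matmul(W, H)
    total = 0.0
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            r = R[i][j] + EPS
            total += r - a
            if a > 0:
                total += a * math.log(a / r)
    return total


def multiplicative_step(A, W, H):
    """One update of H followed by one of W; each factor is the ratio of the
    negative to the positive part of the gradient of the KL objective."""
    n, m, K = len(A), len(A[0]), len(H)
    R = matmul(W, H)
    H = [[H[k][j]
          * sum(W[i][k] * A[i][j] / (R[i][j] + EPS) for i in range(n))
          / (sum(W[i][k] for i in range(n)) + EPS)
          for j in range(m)] for k in range(K)]
    R = matmul(W, H)  # recompute with the refreshed H before updating W
    W = [[W[i][k]
          * sum(H[k][j] * A[i][j] / (R[i][j] + EPS) for j in range(m))
          / (sum(H[k][j] for j in range(m)) + EPS)
          for k in range(K)] for i in range(n)]
    return W, H


def factorize(A, K, iters=200, seed=0):
    """Return W, H and the divergence recorded before and after every step."""
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(K)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(K)]
    history = [kl_divergence(A, W, H)]
    for _ in range(iters):
        W, H = multiplicative_step(A, W, H)
        history.append(kl_divergence(A, W, H))
    return W, H, history


# Hypothetical example: two triangles joined by a single edge (nodes 2 and 3).
A_EXAMPLE = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

if __name__ == "__main__":
    W, H, history = factorize(A_EXAMPLE, K=2)
    print("divergence before:", history[0], "after:", history[-1])
```

Because every factor is a ratio of non-negative quantities, W and H remain non-negative without any explicit projection step, which is the main appeal of the multiplicative form. Extending the sketch to the model of this appendix would require adding the prior-dependent terms to the numerator and denominator of each ratio.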
Cite this article
Alexandridis, G., Siolas, G. & Stafylopatis, A. Enhancing social collaborative filtering through the application of non-negative matrix factorization and exponential random graph models. Data Min Knowl Disc 31, 1031–1059 (2017). https://doi.org/10.1007/s10618-017-0504-3
DOI: https://doi.org/10.1007/s10618-017-0504-3