Abstract
Graphs and matrices are widely used in algorithms for social network analysis. Because the number of actual interactions is much smaller than the number of possible interactions, the graphs and matrices used in these analyses are usually sparse. In this paper, we propose an efficient implementation of a sparse-matrix computation that arises in our publicly available citation recommendation service, theadvisor, as well as in many other recommendation systems. The recommendation algorithm uses a sparse matrix generated from the citation graph. We observed that the nonzero pattern of this matrix is highly irregular and that the computation suffers from a high number of cache misses. We propose techniques for storing the matrix in memory efficiently, and we reduce the number of cache misses with ordering and partitioning. Experimental results show that our techniques are highly effective in reducing the query processing time, which is crucial for a web service.
Notes
The Katz centrality of a node i can be computed as \({\rm Katz}(i) = \sum\nolimits_{j=1}^n {\rm Katz}(i,j) = \sum\nolimits_{j=1}^n\sum\nolimits_{\ell = 1}^\infty \beta^\ell (A^\ell)_{ji}\), where A is the 0–1 adjacency matrix of the citation graph. When β is smaller than the reciprocal of the largest eigenvalue of A, the Katz centralities can be computed as \(((I - \beta A^T)^{-1} - I)\,\vec{I}\), where I is the identity matrix and \(\vec{I}\) is the identity vector, i.e., the vector whose entries are all 1.
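As a rough illustration of this note, the following Python sketch computes the Katz centralities in two ways: through the closed form (solving a linear system instead of forming the inverse) and through a truncated version of the infinite sum. It is a minimal sketch under stated assumptions, not the implementation used in the paper. The function names `katz_closed_form` and `katz_truncated_series`, the `terms` parameter, the dense NumPy representation, and the convention that `A[i, j] = 1` when paper i cites paper j are all assumptions made for this example.

```python
import numpy as np


def katz_closed_form(A, beta):
    """Katz centralities via ((I - beta * A^T)^{-1} - I) applied to the all-ones vector.

    Assumes beta is smaller than the reciprocal of the largest eigenvalue of A,
    so that the underlying series converges.
    """
    n = A.shape[0]
    ones = np.ones(n)
    # Solve (I - beta * A^T) x = 1 rather than forming the inverse explicitly.
    x = np.linalg.solve(np.eye(n) - beta * A.T, ones)
    # Subtracting the all-ones vector accounts for the "- I" term.
    return x - ones


def katz_truncated_series(A, beta, terms=200):
    """Katz centralities via the first `terms` terms of sum_l beta^l (A^T)^l 1."""
    n = A.shape[0]
    v = np.ones(n)
    total = np.zeros(n)
    for _ in range(terms):
        # After l iterations, v equals beta^l (A^T)^l 1, whose i-th entry is
        # beta^l * sum_j (A^l)_{ji}.
        v = beta * (A.T @ v)
        total += v
    return total


if __name__ == "__main__":
    # Hypothetical three-paper citation graph: 0 cites 1 and 2, and 1 cites 2.
    A = np.array([[0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0]])
    print(katz_closed_form(A, 0.5))
    print(katz_truncated_series(A, 0.5))
```

For this small acyclic example with β = 0.5, both functions give approximately 0, 0.5, and 1.25 for the three papers: paper 2 is reached by two paths of length one and one path of length two, which contributes 2 · 0.5 + 0.25. For a citation graph of realistic size, a sparse matrix format and an iterative solver would likely be preferable to the dense solve used here.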
Acknowledgments
This work was supported in part by the DOE grant DE-FC02-06ER2775 and by the NSF grants CNS-0643969, OCI-0904809, and OCI-0904802.
Cite this article
Küçüktunç, O., Kaya, K., Saule, E. et al. Fast recommendation on bibliographic networks with sparse-matrix ordering and partitioning. Soc. Netw. Anal. Min. 3, 1097–1111 (2013). https://doi.org/10.1007/s13278-013-0106-z
DOI: https://doi.org/10.1007/s13278-013-0106-z