Abstract
PageRank is typically computed from powers of the transition matrix in a Markov chain model. It is therefore computationally expensive, and efficient approximation methods that accelerate the computation are needed, especially for large graphs. In this paper, we propose two sampling algorithms for efficient PageRank approximation: direct sampling and adaptive sampling. Both methods sample the transition matrix and use the sample in the PageRank computation. The direct sampling method samples the transition matrix once and uses the sample directly, whereas the adaptive sampling method samples the transition matrix multiple times with an adaptive sample rate that is adjusted iteratively as the computation proceeds. This adaptive sample rate is designed for a good trade-off between accuracy and efficiency in PageRank approximation. We provide detailed theoretical analysis of the error bounds of both methods. We also compare them with several state-of-the-art PageRank approximation methods, including power extrapolation and the inner–outer power iteration algorithm. Experimental results on several real-world datasets show that our methods achieve significantly higher efficiency while attaining accuracy comparable to that of state-of-the-art methods.
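To make the two schemes concrete, the following is a minimal Python/NumPy sketch, not the paper's implementation: it assumes a simple unbiased entrywise sampling of the transition matrix (keep each nonzero entry with probability \(s/N\) and rescale) and a hypothetical geometric schedule for the adaptive sample size; all function names and parameters are illustrative.

```python
import numpy as np

def sample_matrix(P, s):
    """Unbiased entrywise sample: keep each of the N nonzero entries of P
    (a dense array here) with probability s/N and rescale by N/s.
    An illustrative scheme; the paper's sampling distribution may differ."""
    N = np.count_nonzero(P)
    p = min(1.0, s / N)
    mask = (np.random.rand(*P.shape) < p) & (P != 0)
    return np.where(mask, P / p, 0.0)

def pagerank_direct(P, v, c=0.85, iters=50, s=10_000):
    """Direct sampling: sample the transition matrix once, then run
    standard power iteration on the sampled matrix."""
    P_tilde = sample_matrix(P, s)
    pi = v.copy()
    for _ in range(iters):
        pi = c * P_tilde.T @ pi + (1 - c) * v
    return pi

def pagerank_adaptive(P, v, c=0.85, iters=50, s0=10_000, growth=2.0):
    """Adaptive sampling: resample the transition matrix at every
    iteration, growing the sample size (i.e., shrinking the sample
    rate alpha_i) geometrically -- a hypothetical schedule."""
    pi, s = v.copy(), float(s0)
    for _ in range(iters):
        pi = c * sample_matrix(P, s).T @ pi + (1 - c) * v
        s *= growth
    return pi
```

The update \(\pi \leftarrow c\widetilde{P}^T\pi + (1-c)v\) is the standard damped power iteration; only the matrix it multiplies by is sampled.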
References
Achlioptas D, McSherry F (2007) Fast computation of low-rank matrix approximations. J ACM 54(2):9
Avrachenkov K, Lebedev D (2006) PageRank of scale-free growing networks. Internet Math 3(2):207–232
Berkhin P (2005) A survey on PageRank computing. Internet Math 2(1):73–120
Benczúr AA, Csalogány K, Sarlós T (2005) On the feasibility of low-rank approximation for personalized PageRank. In: Proceedings of the 14th international conference on World Wide Web, Chiba, Japan, May 2005, pp 972–973
Borodin A, Roberts GO, Rosenthal JS, Tsaparas P (2005) Link analysis ranking: algorithms, theory, and experiments. ACM Trans Internet Technol 5(1):231–297
Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking: bringing order to the web. Technical Report 1999-66, Stanford InfoLab, November 1999
Candès EJ, Plan Y (2010) Tight oracle bounds for low-rank matrix recovery from a minimal number of random measurements. CoRR abs/1001.0339
Drineas P, Kannan R (2001) Fast Monte-Carlo algorithms for approximate matrix multiplication. In: 42nd annual symposium on foundations of computer science, Las Vegas, Nevada, USA, October 2001, pp 452–459
Drineas P, Kannan R, Mahoney MW (2006) Fast Monte Carlo algorithms for matrices I: approximating matrix multiplication. SIAM J Comput 36(1):132–157
Gleich DF, Gray AP, Greif C, Lau T (2010) An inner–outer iteration for computing PageRank. SIAM J Sci Comput 32(1):349–371
Haveliwala T, Kamvar S, Klein D, Manning C, Golub G (2003) Computing PageRank using power extrapolation. Technical Report, Stanford University, July 2003
He G, Feng H, Li C, Chen H (2010) Parallel SimRank computation on large graphs with iterative aggregation. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, Washington, DC, USA, July 2010, pp 543–552
Kamvar S, Haveliwala T, Golub G (2003) Adaptive methods for the computation of PageRank. Technical Report 2003-26, Stanford InfoLab, April 2003
Kamvar S, Haveliwala T, Manning C, Golub G (2003) Extrapolation methods for accelerating PageRank computations. In: Proceedings of the twelfth international World Wide Web conference, Budapest, Hungary, May 2003, pp 261–270
Kwong MK, Zettl A (1991) Norm inequalities for the powers of a matrix. Am Math Mon 98(6):533–538
Langville AN, Meyer CD (2003) Deeper inside PageRank. Internet Math 1(3):335–380
Lee CP, Golub GH, Zenios SA (2007) A two-stage algorithm for computing PageRank and multistage generalizations. Internet Math 4(4):299–327
McSherry F (2005) A uniform approach to accelerated PageRank computation. In: Proceedings of the 14th international conference on World Wide Web, Chiba, Japan, May 2005, pp 575–582
Osborne JRS, Wiggins E (2009) On accelerating the PageRank computation. Internet Math 6(2):157–172
Sidi A (2008) Methods for acceleration of convergence (extrapolation) of vector sequences. In: Wah BW (ed) Wiley Encyclopedia of Computer Science and Engineering. Wiley, New York
Leskovec J (2007) SNAP: Stanford Network Analysis Platform, large network dataset collection. http://snap.stanford.edu/data/index.html
Wicks J, Greenwald AR (2007) More efficient parallel computation of PageRank. In: Proceedings of the 30th annual international ACM SIGIR conference on research and development in information retrieval, Amsterdam, The Netherlands, July 2007, pp 861–862
Wu G, Wei Y (2010) Arnoldi versus GMRES for computing PageRank: a theoretical contribution to Google's PageRank problem. ACM Trans Inf Syst 28(3):11:1–11:28
Xue GR, Yang Q, Zeng HJ, Yu Y, Chen Z (2005) Exploiting the hierarchical structure for link analysis. In: Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval, Salvador, Brazil, August 2005, pp 186–193
Zhu Y, Ye S, Li X (2005) Distributed PageRank computation based on iterative aggregation-disaggregation methods. In: Proceedings of the 14th ACM international conference on information and knowledge management (CIKM), Bremen, Germany, November 2005, pp 578–585
Appendix
The proof of Theorem 1 is as follows.
Proof
By Theorem 3.1 of Achlioptas and McSherry [1], if \(E(\widetilde{A}_{ij})=A_{ij},\,Var(\widetilde{A}_{ij})\le \delta ^2\), and \(|\widetilde{A}_{ij}-A_{ij}|\le \delta K\), where \(K=\left( \frac{\log (1+\epsilon )}{\log (2n)}\right) ^2\times \sqrt{2n}\) for any fixed \(\epsilon >0\), then for any \(\omega >0\) and \(2n>152\), \(\Vert \widetilde{A}-A\Vert _2 \le 2(1+\epsilon +\omega )\delta \sqrt{2n}\) holds w.h.p.
Since \(|\widetilde{A}_{ij}-A_{ij}|\le \delta \le \delta K\) when we choose \(K>1\), the bound \(\Vert \widetilde{A}-A\Vert _2 \le 2(1+\epsilon +\omega )\delta \sqrt{2n}\) holds w.h.p. for any \(\omega >0\) and \(n>76\).
In particular, for a sparse transition matrix \(A\), the number of nonzero entries is \(N=dn\), where \(d\) is the average degree and \(d\le 50\) for most sparse real-world datasets; thus, \(\sqrt{2n}/\sqrt{N}\) is a constant. Let \(\eta =2(1+\epsilon +\omega )\sqrt{2n}/\sqrt{N}\); then \(\Vert \widetilde{A}-A\Vert _2 \le \eta \sqrt{N} \delta \) holds w.h.p. \(\square \)
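For concreteness (an illustrative choice of constants, not taken from the paper): with \(\epsilon =\omega =0.1\) and average degree \(d=25\), \(\eta = 2(1+\epsilon +\omega )\sqrt{2n}/\sqrt{dn} = 2.4\sqrt{2/25}\approx 0.68\), so the bound reads \(\Vert \widetilde{A}-A\Vert _2 \le 0.68\sqrt{N}\,\delta \) w.h.p.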
The proof of Lemma 2 is as follows.
Proof
We can compute \(E(\widetilde{A}_{ij}^2)\) by Eq. 4 as follows
It is obvious that
The upper bound for the variance of \(\widetilde{A}_{ij}\) is given as
Let \(\delta =\Vert A\Vert _F/\sqrt{s}\). Then, \(|\widetilde{A}_{ij}-A_{ij}| \le \delta \theta < \delta \hbox { and }Var(\widetilde{A}_{ij}) \le \delta ^2\). According to Eq. 2 and Theorem 1, we have
holds w.h.p., where \(\eta \) is a small constant.
We now prove Eq. 6. According to Eq. 12, we can derive the upper bound of \(\Vert \widetilde{A}\Vert _F\) as follows.
We have \(E\left( \Vert \widetilde{A}\Vert _F\right) \le \sqrt{E\left( \Vert \widetilde{A}\Vert _F^2\right) } \le \sqrt{\sum _{ij}E\left( \widetilde{A}_{ij}^2\right) } \le \sqrt{s\frac{\Vert A\Vert _F^2}{s}}=\Vert A\Vert _F\). According to the Chernoff bound, \(Pr\left[ \Vert \widetilde{A}\Vert _F\le \Vert A\Vert _F \right] \ge 1-\exp (-\Omega (\Vert A\Vert _F))\). Thus,
holds w.h.p.
Since \(\Vert \widetilde{A}\Vert _2 \le \Vert \widetilde{A}\Vert _F\), we have
\(\square \)
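Assembled from the two steps above, the conclusion reads: with high probability, \(\Vert \widetilde{A}\Vert _2 \le \Vert \widetilde{A}\Vert _F \le \Vert A\Vert _F\).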
The proof of Theorem 3 is as follows.
Proof
According to Lemma 2, we have \(\Vert \widetilde{P}-P\Vert _2 \le \eta \alpha \Vert P\Vert _F\), where \(\alpha = \sqrt{\frac{N}{s}}\) and \(\eta >0\) is a small constant; moreover, \(\Vert \widetilde{P}\Vert _2 \le \Vert P\Vert _F\).
Let \(R_k =\widetilde{P}^k-P^k\), where \(k\in \{1,\ldots ,K\}\). Then,
From the above steps, we can see that \(\Vert R_k \Vert _2\) can be bounded in terms of \(\Vert R_{k-1} \Vert _2\). Next, we prove
by mathematical induction, as follows.
Basis: Initially,
When \(k=2\), according to Eqs. 14 and 16,
Thus, Eq. 15 holds for the base cases.
Inductive step: Assume that Eq. 15 holds for the \((k-1)\)th step,
Then, we deduce the \(k\)th step according to Eqs. 14 and 17 as follows.
that is, Eq. 15 also holds for the \(k\)th step.
Since both the basis and the inductive step have been established, by mathematical induction, Eq. 15 holds for all natural \(k\). As such, we have
where \(\alpha = \sqrt{\frac{N}{s}}\) and \(\eta >0\) is a small constant. \(\square \)
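A compact reconstruction of the induction (assembled from Lemma 2 and the steps above; the exact constants in the paper's Eq. 15 may differ): since \(R_k = \widetilde{P}^{k-1}\widetilde{P} - P^{k-1}P = R_{k-1}\widetilde{P} + P^{k-1}(\widetilde{P}-P)\), we get \(\Vert R_k\Vert _2 \le \Vert R_{k-1}\Vert _2\Vert P\Vert _F + \eta \alpha \Vert P\Vert _F^{k}\), and with the basis \(\Vert R_1\Vert _2 \le \eta \alpha \Vert P\Vert _F\), induction gives \(\Vert \widetilde{P}^k-P^k\Vert _2 \le k\,\eta \alpha \Vert P\Vert _F^{k}\).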
The proof of Theorem 4 is as follows.
Proof
Let \(B_{i}\) be the input matrix \(B\) at the \(i\)th iteration of Algorithm 8, where \(1 \le i \le k\). Then, \(B_{i}=B_{i-1}\widetilde{P}_{i}\), where \(\widetilde{P}_{i}\) is the sampled matrix of \(P\) with \(\alpha _{i}=\sqrt{\frac{N}{s_i}}\), \(N\) is the number of nonzero entries of \(P\), and \(s_i\) is the sample size of \(\widetilde{P}_{i}\). According to Lemma 2, \(\Vert \widetilde{P}_{i}- P\Vert _2 \le \eta \alpha _{i}\Vert P\Vert _F\); hence, we have
From Lemma 2, \(\Vert \widetilde{P}_i\Vert _F \le \Vert P\Vert _F\); thus,
Since \(\Vert B_1\Vert _2 = \Vert P\Vert _2 \le \Vert P\Vert _F \), from Eq. 20 we have
Since the estimate of \(P^k\) produced by Algorithm 8 is \(\widetilde{P^k} = B_k = B_{k-1}\widetilde{P}_k\), the total error of estimating \(P^k\) in Algorithm 8 is given by \(\Vert B_k-P^k\Vert _2\). Note that \(B_0=I\) is the identity matrix. According to Eqs. 19 and 21, we have
If we adaptively choose \(\alpha _i=a\alpha _{i-1}\) with \(\alpha _1 = a\), then \(\alpha _i=a^i\), and we obtain the error bound as follows:
\(\square \)
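Unrolled, the recursion above gives (a reconstruction, assuming Eqs. 19–21 are the per-iteration analogues of the bound in Theorem 3) \(\Vert B_k-P^k\Vert _2 \le \eta \left( \sum _{i=1}^{k}\alpha _i\right) \Vert P\Vert _F^{k}\). With \(\alpha _i=a^i\) and \(a<1\), \(\sum _{i=1}^{k}a^i = \frac{a(1-a^k)}{1-a} \le \frac{a}{1-a}\), so the adaptive error bound is at most \(\eta \frac{a}{1-a}\Vert P\Vert _F^{k}\), independent of \(k\), whereas the direct-sampling bound grows linearly in \(k\).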
The proof of Theorem 5 is as follows.
Proof
The total error of estimating \(\pi \) is represented by
where \(c\) is a constant and \(v\) is a constant vector.
According to the error analysis of direct sampling of \(P^k\) in Theorem 3,
where \(\eta >0\) is a small constant; thus, the total error of estimating \(\pi \) by the direct sampling method is proportional to
Since
we have
With a certain choice of \(\alpha _i\hbox { for }1 \le i \le K\), according to the error analysis of adaptive sampling of \(P^k\) in Theorem 4,
where \(\eta >0\) is a small constant; thus, the total error of estimating \(\pi \) by the adaptive sampling method is proportional to
\(\square \)
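Comparing the two (a sketch under the assumption that \(\pi \) admits the power-series expansion \(\pi =(1-c)\sum _{k\ge 0}c^k (P^T)^k v\)): the direct-sampling error is proportional to \(\eta \alpha \sum _{k} k\,c^k\Vert P\Vert _F^k\), while the adaptive-sampling error is proportional to \(\eta \sum _{k} c^k\left( \sum _{i\le k}\alpha _i\right) \Vert P\Vert _F^k\), which for \(\alpha _i=a^i\) is at most \(\eta \frac{a}{1-a}\sum _{k} c^k\Vert P\Vert _F^k\); the adaptive schedule removes the factor of \(k\) from each term.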
Cite this article
Liu, W., Li, G. & Cheng, J. Fast PageRank approximation by adaptive sampling. Knowl Inf Syst 42, 127–146 (2015). https://doi.org/10.1007/s10115-013-0691-1