Truncated and Sparse Power Methods with Partially Updating for Large and Sparse Higher-Order PageRank Problems

Abstract

Higher-order Markov chains play an important role in analyzing a variety of stochastic processes over time in multi-dimensional spaces. One of their important applications is the higher-order PageRank problem. The recently proposed truncated power method provides another way to solve this problem; however, its accuracy and efficiency can be poor in practical computations. In this work, we revisit the higher-order PageRank problem and consider how to solve it efficiently. The contributions of this work are as follows. First, we accelerate the truncated power method for higher-order PageRank. In the improved version, there is no need to form and store the vectors arising from the dangling states, nor to store an auxiliary matrix. Second, we propose a truncated power method with partially updating to further reduce the overhead, in which only some important columns of the approximation are updated in each iteration. However, the truncated power method solves, for convenience, a modified higher-order PageRank model that is not mathematically equivalent to the original one. Thus, the third contribution of this work is a sparse power method with partially updating for the original higher-order PageRank problem. The convergence of all the proposed methods is discussed. Numerical experiments are performed on large and sparse real-world and synthetic data sets. The results demonstrate the superiority of our new algorithms over some state-of-the-art ones for large and sparse higher-order PageRank problems.
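To fix ideas, the following is a minimal sketch, not the authors' algorithms, of the kind of iteration the abstract refers to: a power method for the second-order (multilinear) PageRank model \(\textbf{x} = \alpha \textbf{P}(\textbf{x}\otimes \textbf{x}) + (1-\alpha)\textbf{v}\) of [9], combined with a simple truncation step that drops entries below a threshold \(\beta\) and renormalizes. The function name, the choice of \(\beta\), and the explicit Kronecker product are illustrative assumptions made for readability, not efficiency.

```python
# Minimal sketch (illustrative only, not the authors' algorithms) of a
# truncated power iteration for the second-order multilinear PageRank
# model x = alpha * P @ kron(x, x) + (1 - alpha) * v from [9].
# P is the n-by-n^2 flattening of the transition tensor (dense ndarray
# or a scipy.sparse matrix); the threshold beta and the dense Kronecker
# product are schematic choices, not the paper's implementation.
import numpy as np

def truncated_power_sketch(P, alpha=0.85, beta=None, tol=1e-8, maxit=1000):
    n = P.shape[0]
    v = np.full(n, 1.0 / n)                 # uniform teleportation vector
    x = v.copy()
    if beta is None:
        beta = 1.0 / n**2                    # one of the thresholds used in the paper's tables
    for _ in range(maxit):
        y = alpha * (P @ np.kron(x, x)) + (1.0 - alpha) * v
        y[y < beta] = 0.0                    # truncation: keep only "important" entries
        y /= y.sum()                         # renormalize to a probability vector
        if np.linalg.norm(y - x, 1) < tol:   # 1-norm change as stopping criterion
            return y
        x = y
    return x
```

For large \(n\), forming \(\textbf{x}\otimes \textbf{x}\) explicitly is prohibitive; exploiting the sparsity of both the data and the truncated iterate, and updating only selected columns, is precisely the overhead reduction the abstract describes.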

Data availability

Enquiries about data availability should be directed to the authors.

Notes

  1. We mention that there is a typo in Algorithm 2 of [7], where \(\mathbf{G}(:,j)\) should be replaced with \(\mathbf{G}(j,:)^{T}\); see (3.3).

  2. We thank Dr. Weiyang Ding for providing us with the data set and the MATLAB codes of the truncated power method.

References

  1. Ching, W., Fung, E., Ng, M.K.: Higher-order Markov chain models for categorical data sequences. Naval Res. Logist. 51, 557–574 (2004)

  2. Ching, W., Ng, M.K.: Markov Chains: Models, Algorithms and Applications. Springer (2006)

  3. Ching, W., Ng, M.K., Fung, E.: Higher-order multivariate Markov chains and their applications. Linear Algebra Appl. 428, 492–507 (2008)

  4. Chung, K.: Elementary Probability Theory with Stochastic Processes. Springer, New York (1979)

  5. Ciarlet, P.: Linear and Nonlinear Functional Analysis with Applications. SIAM, Philadelphia (2013)

  6. Cipolla, S., Redivo-Zaglia, M., Tudisco, F.: Extrapolation methods for fixed-point Multilinear PageRank computations. Numer. Linear Algebra Appl. 27, e2280 (2020)

  7. Ding, W., Ng, M.K., Wei, Y.: Fast computation of stationary joint probability distribution of sparse Markov chains. Appl. Numer. Math. 125, 68–85 (2018)

  8. Gleich, D.: PageRank beyond the web. SIAM Rev. 57, 321–363 (2015)

  9. Gleich, D., Lim, L., Yu, Y.: Multilinear PageRank. SIAM J. Matrix Anal. Appl. 36, 1507–1541 (2015)

  10. Guo, P.: A residual-based error bound for the multilinear PageRank vector. Linear Multilinear Algebra 68, 568–574 (2020)

  11. Guo, P., Gao, S., Guo, X.: A modified Newton method for multilinear PageRank. Taiwan. J. Math. 22, 1161–1171 (2018)

  12. Huang, J., Wu, G.: Convergence of the fixed-point iteration for multilinear PageRank. Numer. Linear Algebra Appl. 28, e2379 (2021)

  13. Kuntz, J., Thomas, P., Stan, G., Barahona, M.: Stationary distributions of continuous-time Markov chains: a review of theory and truncation-based approximations. SIAM Rev. 63, 3–64 (2021)

  14. Langville, A., Meyer, C.: Google’s PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press (2006)

  15. Li, W., Liu, D., Ng, M.K., Vong, S.: The uniqueness of multilinear PageRank vectors. Numer. Linear Algebra Appl. 24, e2107 (2017)

  16. Li, W., Liu, D., Vong, S., Xiao, M.: Multilinear PageRank: uniqueness, error bound and perturbation analysis. Appl. Numer. Math. 156, 584–607 (2020)

  17. Li, W., Liu, D., Vong, S.: Perron vector analysis for irreducible nonnegative tensors and its applications. J. Ind. Manag. Optim. 17, 29–50 (2021)

  18. Li, W., Ng, M.K.: On the limiting probability distribution of a transition probability tensor. Linear Multilinear Algebra 62, 362–385 (2014)

  19. Li, X., Ng, M.K.: Solving sparse non-negative tensor equations: algorithms and applications. Front. Math. China 10, 649–680 (2015)

  20. Li, X., Ng, M.K., Ye, Y.: HAR: hub, authority and relevance scores in multi-relational data for query search. In: Proceedings of the 2012 SIAM International Conference on Data Mining, pp. 141–152 (2012)

  21. Liu, D., Li, W., Vong, S.: Relaxation methods for solving the tensor equation arising from the higher-order Markov chains. Numer. Linear Algebra Appl. 26, e2260 (2019)

  22. Meini, B., Poloni, F.: Perron-based algorithms for the multilinear PageRank. Numer. Linear Algebra Appl. 25, e2177 (2018)

  23. Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: bringing order to the web. Stanford Digital Libraries Working Paper (1999)

  24. Raftery, A.: A model for high-order Markov chains. J. R. Stat. Soc. 47, 528–539 (1985)

  25. Stewart, W.: Probability, Markov Chains, Queues, and Simulation: The Mathematical Basis of Performance Modeling. Princeton University Press (2009)

  26. Voorhees, E., Harman, D.: TREC: Experiment and Evaluation in Information Retrieval, vol. 1. MIT Press (2005)

  27. Wu, G., Wang, Y., Jin, X.: A preconditioned and shifted GMRES algorithm for the PageRank problem with multiple damping factors. SIAM J. Sci. Comput. 34, A2558–A2575 (2012)

  28. Wu, Y., Bian, Y., Zhang, X.: Remember where you came from: on the second-order random walk based proximity measures. Proc. VLDB Endow. 10, 13–24 (2016)

Acknowledgements

We would like to express our sincere thanks to our editor and the anonymous referees for insightful comments and suggestions that greatly improved the presentation of this paper.

Funding

Funding was provided by the National Natural Science Foundation of China (Grant No. 12271518).

Author information

Corresponding author

Correspondence to Gang Wu.

Ethics declarations

Competing interests

The authors declare that there are no conflicts of interest related to this work. Our manuscript has no associated data.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is supported by the National Natural Science Foundation of China under grant 12271518, the Key Research and Development Project of Xuzhou Natural Science Foundation under grant KC22288, and the Open Project of Key Laboratory of Data Science and Intelligence Education of the Ministry of Education under grant DSIE202203.

Appendices

Appendix A: Numerical Results of Subsection 6.2 on Real-World Data Set

See Tables 3, 4, 5, 6 and 7.

Table 3 A comparison of the six algorithms for high-order PageRank on the .GOV Web collection, \(\alpha =0.85\); the sparsification parameter \(\beta \) is chosen as \(\frac{1}{n^{2}},\frac{1}{n^{3}}\) and \(\frac{1}{n^{4}}\), respectively. Notice that the approximation obtained from the power method is generally dense
Table 4 A comparison of the six algorithms for high-order PageRank on the .GOV Web collection, \(\alpha =0.90\); the sparsification parameter \(\beta \) is chosen as \(\frac{1}{n^{2}},\frac{1}{n^{3}}\) and \(\frac{1}{n^{4}}\), respectively. Notice that the approximation obtained from the power method is generally dense
Table 5 A comparison of the six algorithms for high-order PageRank on the .GOV Web collection, \(\alpha =0.95\); the sparsification parameter \(\beta \) is chosen as \(\frac{1}{n^{2}},\frac{1}{n^{3}}\) and \(\frac{1}{n^{4}}\), respectively. Notice that the approximation obtained from the power method is generally dense
Table 6 A comparison of the six algorithms for high-order PageRank on the .GOV Web collection, \(\alpha =0.99\); the sparsification parameter \(\beta \) is chosen as \(\frac{1}{n^{2}},\frac{1}{n^{3}}\) and \(\frac{1}{n^{4}}\), respectively. Notice that the approximation obtained from the power method is generally dense
Table 7 The number of top-ten ranking results of the .GOV Web collection obtained from the five algorithms, \(\alpha =0.85\)

Appendix B: Numerical Results of Subsection 6.3 on Synthetic Data Set

See Tables 8, 9, 10 and 11.

Table 8 A comparison of the algorithms on synthetic data, \(n=10000,\alpha = 0.85\). Notice that the approximations obtained from the power method to EIOM are generally dense
Table 9 A comparison of the algorithms on synthetic data, \(n=10000,\alpha = 0.90\). Notice that the approximations obtained from the power method to EIOM are generally dense
Table 10 A comparison of the algorithms on synthetic data, \(n=10000,\alpha = 0.95\). Notice that the approximations obtained from the power method to EIOM are generally dense
Table 11 A comparison of the algorithms on synthetic data, \(n=10000,\alpha = 0.99\). Notice that the approximations obtained from the power method to EIOM are generally dense

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Cite this article

Huang, J., Wu, G. Truncated and Sparse Power Methods with Partially Updating for Large and Sparse Higher-Order PageRank Problems. J Sci Comput 95, 34 (2023). https://doi.org/10.1007/s10915-023-02146-0
