Efficient PageRank with Same Out-Link Groups

  • Conference paper
Information Retrieval Technology (AIRS 2004)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 3411))

Abstract

The traditional PageRank algorithm suffers from heavy computation cost because of the huge number of web pages. In this paper, we propose a more efficient algorithm that computes the PageRank value of each web page directly on same out-link groups. The new algorithm treats pages with the same out-link behavior (SOLB) as a unit. We prove that the derived PageRank is the same as that of the original PageRank algorithm, which computes over single web pages, while the proposed algorithm greatly improves efficiency. For simplicity, we restrict each group to a single directory and define metrics to measure how similar pages are in their out-link behavior. We design experiments that group pages from 0.5 similarity up to exact SOLB; the results show that such grouping yields rank scores similar to those of the traditional PageRank algorithm and achieves a remarkable 50% gain in efficiency.

This work was done at Microsoft Research Asia.
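
The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, which is not reproduced on this page; it only shows why pages with identical out-link sets can be processed as one unit without changing the scores. The function names, the damping value of 0.85, the uniform handling of dangling pages, and the toy graph are all assumptions made for illustration, and the sketch covers only exact SOLB groups, without the directory restriction or the similarity metrics mentioned in the abstract.

```python
from collections import defaultdict


def pagerank_plain(links, damping=0.85, iters=50):
    """Page-by-page power iteration (baseline).

    links maps each page to the set of pages it links to; every target
    is assumed to appear as a key of links as well.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            # A page without out-links spreads its rank over all pages.
            targets = set(links[p]) or pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new[q] += share
        rank = new
    return rank


def pagerank_grouped(links, damping=0.85, iters=50):
    """Same iteration, but pages with identical out-link sets share one pass."""
    pages = sorted(links)
    n = len(pages)
    # Hypothetical grouping key: the exact set of out-link targets.
    groups = defaultdict(list)
    for p in pages:
        groups[frozenset(links[p])].append(p)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for out, members in groups.items():
            # All members send rank/len(out) to the same targets, so the
            # group's total rank can be distributed in a single loop.
            total = sum(rank[p] for p in members)
            targets = out or pages
            share = damping * total / len(targets)
            for q in targets:
                new[q] += share
        rank = new
    return rank, len(groups)


# Toy graph: three pages in directory "a/" have the same out-links.
links = {
    "a/1": {"hub", "b/1"},
    "a/2": {"hub", "b/1"},
    "a/3": {"hub", "b/1"},
    "b/1": {"hub"},
    "hub": {"a/1", "a/2", "a/3"},
}
plain = pagerank_plain(links)
grouped, n_groups = pagerank_grouped(links)
```

In this toy graph the grouped version walks three out-link lists per iteration instead of five, and the two score vectors agree up to floating-point rounding. Relaxing the grouping to pages that are only partly alike would need a similarity metric, which the abstract mentions but does not define here.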

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lu, Y. et al. (2005). Efficient PageRank with Same Out-Link Groups. In: Myaeng, S.H., Zhou, M., Wong, KF., Zhang, HJ. (eds) Information Retrieval Technology. AIRS 2004. Lecture Notes in Computer Science, vol 3411. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31871-2_13

  • DOI: https://doi.org/10.1007/978-3-540-31871-2_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25065-4

  • Online ISBN: 978-3-540-31871-2

  • eBook Packages: Computer Science, Computer Science (R0)
