Eigenvectors of directed graphs and importance scores: dominance, T-Rank, and sink remedies

Data Mining and Knowledge Discovery

Abstract

We study the properties of the principal eigenvector for the adjacency matrix (and related matrices) for a general directed graph. In particular—motivated by the use of the eigenvector for estimating the “importance” of the nodes in the graph—we focus on the distribution of positive weight in this eigenvector, and give a coherent picture which builds upon and unites earlier results. We also propose a simple method—“T-Rank”—for generating importance scores. T-Rank generates authority scores via a one-level, non-normalized matrix, and is thus distinct from known methods such as PageRank (normalized), HITS (two-level), and SALSA (two-level and normalized). We show, using our understanding of the principal eigenvector, that T-Rank has a much less severe “sink problem” than does PageRank. Also, we offer numerical results which quantify the “tightly-knit community” or TKC effect. We find that T-Rank has a stronger TKC effect than PageRank, and we offer a novel interpolation method which allows for continuous tuning of the strength of this TKC effect. Finally, we propose two new “sink remedies”, i.e., methods for ensuring that the principal eigenvector is positive everywhere. One of our sink remedies (source pumping) is unique among sink remedies, in that it gives a positive eigenvector without rendering the graph strongly connected. We offer a preliminary evaluation of the effects and possible applications of these new sink remedies.
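The contrast between a non-normalized and a normalized one-level score can be illustrated with a small sketch. The abstract does not give T-Rank's exact definition, so the code below rests on an assumption: that "one-level, non-normalized" means taking the principal eigenvector of the transposed adjacency matrix, so that each node passes its full score along every outgoing link. The normalized variant divides by out-degree, as in PageRank without damping. The function names `link_matrix` and `principal_eigenvector` and the toy graph are hypothetical and are not taken from the article.

```python
def link_matrix(edges, n, normalize):
    """Build M with M[dst][src] > 0 for each link src -> dst.

    normalize=False keeps every link at weight 1 (the non-normalized
    matrix that this sketch ASSUMES corresponds to T-Rank);
    normalize=True divides by the source's out-degree
    (PageRank-style, without damping).
    """
    out_degree = [0] * n
    for src, _ in edges:
        out_degree[src] += 1
    matrix = [[0.0] * n for _ in range(n)]
    for src, dst in edges:
        matrix[dst][src] = 1.0 / out_degree[src] if normalize else 1.0
    return matrix


def principal_eigenvector(matrix, iterations=200):
    """Power iteration, rescaled each step so the entries sum to 1."""
    n = len(matrix)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * scores[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        scores = [x / total for x in nxt]
    return scores


# Toy graph, chosen to be strongly connected and aperiodic so that it has
# no sinks and the iteration converges: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
t_scores = principal_eigenvector(link_matrix(edges, 3, normalize=False))
p_scores = principal_eigenvector(link_matrix(edges, 3, normalize=True))
print("non-normalized:", [round(s, 3) for s in t_scores])
print("normalized:    ", [round(s, 3) for s in p_scores])
```

On this graph the normalized scores of nodes 0 and 2 tie, because node 0 splits its weight between its two outgoing links. The non-normalized scores rank node 2 above node 0, since node 2 receives the full score of both nodes that point to it. A graph with sinks would need one of the remedies discussed in the article, which this sketch does not attempt to reproduce.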

Author information

Corresponding author

Correspondence to M. Burgess.

Additional information

Responsible editor: R. Bayardo.

About this article

Cite this article

Bjelland, J., Burgess, M., Canright, G. et al. Eigenvectors of directed graphs and importance scores: dominance, T-Rank, and sink remedies. Data Min Knowl Disc 20, 98–151 (2010). https://doi.org/10.1007/s10618-009-0154-1

  • DOI: https://doi.org/10.1007/s10618-009-0154-1
