ABSTRACT
Mainstream link-based static-rank algorithms (e.g., PageRank and its variants) express the importance of a page as a linear combination of the scores contributed by its in-links, and compute page importance scores by iteratively solving a linear system. Such linear algorithms, however, can give apparently unreasonable static-rank results for some link structures. In this paper, we examine the static-rank computation problem from the viewpoint of evidence combination and build a probabilistic model for it. Based on the model, we argue that a nonlinear formula should be adopted because links are correlated with, or dependent on, one another. We focus on simple formulas that consider only the correlation between links in the same domain. Experiments on 100 million web pages, using multiple static-rank quality evaluation metrics, show that the new nonlinear algorithms yield higher-quality static-rank. We also prove the convergence of the new algorithms using nonlinear functional analysis.
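For orientation, the linear baseline described above can be sketched as a standard PageRank power iteration. This is a minimal illustration of the conventional linear scheme only, not the nonlinear formulas proposed in the paper; the function name `pagerank`, the damping value 0.85, the stopping tolerance, and the toy graph are all assumptions made for the example.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Linear PageRank by power iteration (illustrative baseline only).

    out_links maps each page to the list of pages it links to; every
    link target is assumed to appear as a key as well.
    """
    pages = sorted(out_links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not out_links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        # Each page's score is a linear combination of its in-links' scores.
        for p in pages:
            targets = out_links[p]
            for q in targets:
                new_rank[q] += damping * rank[p] / len(targets)
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical four-page graph used only to exercise the function.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
```

In this sketch every in-link contributes additively, regardless of whether several in-links come from the same domain. That additive treatment is the property the abstract argues against when links are correlated.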
Index Terms
- Nonlinear static-rank computation