Abstract
Author metadata are an important characterization of scientific publications and often encode valuable domain knowledge. Publications by established or emerging reputable authors motivate further research, in the spirit of “standing on the shoulders of giants”. This paper addresses the author ranking problem for information retrieval and recommendation, and its contributions are four-fold. First, we employ full-text citation analysis (citation context) to enhance the classical author citation network. Second, a supervised topic modeling method is used to determine the contribution of a specific author (as a vertex) or a citation (as an edge). Third, PageRank with prior and transition topical probability distributions measures the importance of authors (in the graph) for each scientific topic. Finally, we propose a novel evaluation method to compare the results of PageRank with prior against classical ranking methods, i.e., BM25, TF-IDF, Language Model, and PageRank. The results show that our ranking method with full-text citation analysis significantly (p < 0.001) outperforms the other ranking methods.
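The third contribution can be illustrated with a minimal sketch of PageRank with priors on a weighted author citation graph for a single topic. This is not the authors' implementation: the update rule follows the general PageRank-with-priors form (a walker returns to the prior distribution with probability `beta`, otherwise follows an outgoing citation in proportion to its weight), and the function name, the `beta` value, the handling of authors without outgoing citations, and the toy weights are all assumptions made for illustration.

```python
def pagerank_with_prior(edges, prior, beta=0.15, iters=200, tol=1e-12):
    """Rank authors for one topic (illustrative sketch, not the paper's code).

    edges: dict mapping (citing_author, cited_author) -> non-negative weight,
           e.g. a topical probability of the citation (hypothetical input).
    prior: dict mapping author -> non-negative topical prior weight.
    beta:  probability of jumping back to the prior distribution (assumed value).
    """
    nodes = sorted(set(prior) | {u for u, _ in edges} | {v for _, v in edges})
    total_prior = sum(prior.get(n, 0.0) for n in nodes)
    if total_prior <= 0:
        raise ValueError("prior must have positive total weight")
    # Normalize the prior into a probability distribution over authors.
    p = {n: prior.get(n, 0.0) / total_prior for n in nodes}

    # Total outgoing weight per author, used to normalize transitions.
    out_weight = {n: 0.0 for n in nodes}
    for (u, _), w in edges.items():
        out_weight[u] += w

    rank = dict(p)  # start the iteration from the prior
    for _ in range(iters):
        # Assumption: mass held by authors with no outgoing citations
        # is redistributed according to the prior.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        new = {n: beta * p[n] + (1.0 - beta) * dangling * p[n] for n in nodes}
        for (u, v), w in edges.items():
            if out_weight[u] > 0.0:
                new[v] += (1.0 - beta) * rank[u] * w / out_weight[u]
        delta = sum(abs(new[n] - rank[n]) for n in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy graph: authors "a" and "b" cite "c"; "c" cites "a" (weights are made up).
toy_edges = {("a", "c"): 0.9, ("b", "c"): 0.8, ("c", "a"): 0.1}
toy_prior = {"a": 0.2, "b": 0.2, "c": 0.6}
toy_rank = pagerank_with_prior(toy_edges, toy_prior)
for author in sorted(toy_rank, key=toy_rank.get, reverse=True):
    print(author, round(toy_rank[author], 3))
```

Running the function once per topic, with topic-specific priors and edge weights, would yield one author ranking per topic. How the paper derives those distributions from the supervised topic model is not stated in the abstract, so the inputs here are placeholders.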
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, J., Guo, C., Liu, X. (2012). Topic Based Author Ranking with Full-Text Citation Analysis. In: Hou, Y., Nie, JY., Sun, L., Wang, B., Zhang, P. (eds) Information Retrieval Technology. AIRS 2012. Lecture Notes in Computer Science, vol 7675. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35341-3_43
DOI: https://doi.org/10.1007/978-3-642-35341-3_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35340-6
Online ISBN: 978-3-642-35341-3
eBook Packages: Computer Science, Computer Science (R0)