Abstract
Expert finding is vital for exploring scientific collaborations, which increase productivity by sharing and transferring knowledge within and across research areas. Expert finding methods studied in recent years include content-based methods, link structure-based methods, and combinations of the two. However, most state-of-the-art approaches treat candidates’ personal information (e.g. topic relevance and citation counts) and network information (e.g. citation relationships) separately, causing some potential experts to be overlooked. In this paper, we propose a topical and weighted factor graph model that combines all of this information in a unified way. We also design a Loopy Max-Product algorithm and related message-passing schedules to perform approximate inference on our factor graph model, which contains cycles. Information Retrieval is chosen as the test field, in which we identify representative authors for different topics. Finally, we compare our approach with three baseline methods in terms of topic sensitivity, coverage rate of SIGIR PC (i.e. Program Committee or Program Chair) members, and Normalized Discounted Cumulated Gain scores for different rankings on each topic. The experimental results demonstrate that our factor graph-based model enhances expert-finding performance.
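As an illustration of the ranking metric named above, the following is a minimal sketch of Normalized Discounted Cumulated Gain for a single ranked list. It assumes the widely used log2(rank + 1) discount, which may differ from the exact formulation used in the experiments; the function names and the example relevance grades are hypothetical and are not taken from the paper.

```python
import math


def dcg(gains):
    """Discounted cumulated gain, assuming the log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ordering.

    `gains` lists graded relevance values in ranked order; `k` optionally
    truncates the ranking to its top-k entries.
    """
    ranked = gains[:k] if k else gains
    # The ideal ranking places the highest gains first.
    ideal = sorted(gains, reverse=True)[:len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical relevance grades for a top-5 author ranking on one topic.
example_gains = [3, 2, 3, 0, 1]
score = ndcg(example_gains)
```

A ranking that already lists authors in descending order of relevance scores exactly 1.0 under this definition, and any misordering lowers the score, which is what makes the metric suitable for comparing rankings produced by different methods on the same topic.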
Acknowledgments
This work was supported by the following grants: (i) Grant No. 20090094110015 from the Research Fund for the Doctoral Program of Higher Education of China, (ii) Grants No. BK2008354 and No. BK2010520 from the Natural Science Foundation of Jiangsu Province of China, and (iii) Grant No. 2008135 from the “Six Talent Peaks Program” of Jiangsu Province of China.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Table A1: Comparison of existing expert finding approaches (DOCX 23 kb)
Table A2: Top 10 words associated with each topic (DOCX 19 kb)
Table A3: Top 10 authors for 5 different topics based on our approach and baseline methods (DOCX 25 kb)
Fig. A1: Loopy Max-Product algorithm with serial schedule using random sequences (DOCX 21 kb)
Cite this article
Lin, L., Xu, Z., Ding, Y. et al. Finding topic-level experts in scholarly networks. Scientometrics 97, 797–819 (2013). https://doi.org/10.1007/s11192-013-0988-6
DOI: https://doi.org/10.1007/s11192-013-0988-6