Method for Expert Search Using Topical Similarity of Documents

  • Conference paper
Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2019)

Abstract

This article addresses the problem of finding and selecting experts to review grant applications, proposals, and scientific papers. We analyze the main shortcomings of the methods currently used to solve this problem. These shortcomings can be eliminated by analyzing large collections of scientific and technical documents whose authors are potential experts on various topics. We describe a method that forms a ranked list of experts for a given document by searching for topically similar documents. To evaluate the proposed method, we used a collection of grant applications from a science foundation and compared the method with one based on topic modeling. The experiments show that the proposed method performs slightly better in terms of recall, MAP, and NDCG. Finally, we discuss the current limitations of the proposed method.
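The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or how document scores are turned into expert scores, so everything below is an assumption made for illustration: TF-IDF cosine similarity stands in for the topical similarity search, each author's score is the sum of the similarities of their retrieved documents, and all function names (`rank_experts`, `recall_at_k`, `average_precision`, `ndcg_at_k`) and the toy corpus are hypothetical. MAP is the mean of `average_precision` over all queries.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Build L2-normalised TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [vectorize(doc, idf) for doc in docs], idf


def vectorize(tokens, idf):
    """Turn tokens into a normalised TF-IDF vector; unknown terms are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: f * idf[t] for t, f in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def cosine(a, b):
    """Cosine similarity of two already normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def rank_experts(query_tokens, corpus, top_docs=10):
    """corpus is a list of (tokens, authors) pairs.

    Assumption: an author's score is the sum of similarities of their
    documents among the top_docs most similar ones.
    """
    vectors, idf = tfidf_vectors([tokens for tokens, _ in corpus])
    query = vectorize(query_tokens, idf)
    sims = sorted(((cosine(query, v), i) for i, v in enumerate(vectors)),
                  reverse=True)[:top_docs]
    scores = defaultdict(float)
    for sim, i in sims:
        if sim <= 0.0:
            continue  # documents with no shared terms contribute nothing
        for author in corpus[i][1]:
            scores[author] += sim
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def recall_at_k(ranked, relevant, k):
    """Share of relevant experts found among the first k ranked experts."""
    return len(set(ranked[:k]) & relevant) / len(relevant)


def average_precision(ranked, relevant):
    """Average precision for one query, with binary relevance."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains and a log2 discount."""
    dcg = sum(1.0 / math.log2(r + 1)
              for r, item in enumerate(ranked[:k], start=1) if item in relevant)
    ideal = sum(1.0 / math.log2(r + 1)
                for r in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal if ideal else 0.0


# Toy usage with made-up documents and author names.
corpus = [
    (["topic", "model", "text"], ["Ivanov"]),
    (["protein", "fold", "cell"], ["Petrov"]),
    (["text", "search", "retrieval", "topic"], ["Ivanov", "Sidorov"]),
]
ranking = rank_experts(["topic", "text", "search"], corpus)
experts = [author for author, _ in ranking]
print(experts, average_precision(experts, {"Sidorov"}))
```

Summing similarities favors authors with several relevant documents; taking the maximum or a rank-discounted sum would be equally plausible aggregation choices under the same assumptions.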

The research is supported by the Russian Foundation for Basic Research (grant No. 18-29-03087). The reported research is also partially funded by the project “Text mining tools for big data” as part of the program supporting Technical Leadership Centers of the National Technological Initiative “Center for Big Data Storage and Processing” at Moscow State University (Agreement with the Fund supporting the NTI projects No. 13/1251/2018 of 11.12.2018).

Notes

  1. https://github.com/bigartm/bigartm version 0.10.0.

  2. https://github.com/rare-technologies/gensim version 3.8.1.

Author information

Corresponding author

Correspondence to Denis Zubarev.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zubarev, D., Devyatkin, D., Sochenkov, I., Tikhomirov, I., Grigoriev, O. (2020). Method for Expert Search Using Topical Similarity of Documents. In: Elizarov, A., Novikov, B., Stupnikov, S. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2019. Communications in Computer and Information Science, vol 1223. Springer, Cham. https://doi.org/10.1007/978-3-030-51913-1_11

  • DOI: https://doi.org/10.1007/978-3-030-51913-1_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-51912-4

  • Online ISBN: 978-3-030-51913-1

  • eBook Packages: Computer Science, Computer Science (R0)
