A qualitative study of large-scale recommendation algorithms for biomedical knowledge bases

Noei, Ehsan; Hayat, Tsahi; Perrie, Jessica; Çolak, Recep; Hao, Yanqi; Vembu, Shankar; Lyons, Kelly; Molyneux, Sam

doi:10.1007/s00799-021-00300-3

Published: 19 April 2021

Volume 22, pages 197–215 (2021)

International Journal on Digital Libraries

Abstract

The frequency at which new research documents are being published causes challenges for researchers who increasingly need access to relevant documents in order to conduct their research. Searching across a variety of databases and browsing millions of documents to find semantically relevant material is a time-consuming task. Recently, there has been a focus on recommendation algorithms that suggest relevant documents based on the current interests of the researchers. In this paper, we describe the implementation of seven commonly used algorithms and three aggregation algorithms. We evaluate the recommendation algorithms in a large-scale biomedical knowledge base with the goal of identifying relative weaknesses and strengths of each algorithm. We analyze the recommendations from each algorithm based on assessments of output as evaluated by 14 biomedical researchers. The results of our research provide unique insights into the performance of recommendation algorithms against the needs of modern-day biomedical researchers.

Acknowledgements

This research was supported in part by a Natural Sciences and Engineering Research Council (NSERC) Engage Grant and an NSERC Strategic Partnership Project Grant. The authors wish to recognize contributions of Bahar Ghadiri Bashardoost and Yuyang Liu.

Author information

Authors and Affiliations

University of Toronto, Toronto, Canada
Ehsan Noei, Tsahi Hayat, Jessica Perrie & Kelly Lyons
Meta, Toronto, Canada
Recep Çolak, Yanqi Hao & Shankar Vembu
The Chan Zuckerberg Initiative, Redwood City, USA
Sam Molyneux

Corresponding author

Correspondence to Ehsan Noei.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Noei, E., Hayat, T., Perrie, J. et al. A qualitative study of large-scale recommendation algorithms for biomedical knowledge bases. Int J Digit Libr 22, 197–215 (2021). https://doi.org/10.1007/s00799-021-00300-3

Received: 16 May 2020
Revised: 15 March 2021
Accepted: 02 April 2021
Published: 19 April 2021
Issue Date: June 2021
DOI: https://doi.org/10.1007/s00799-021-00300-3
