Abstract
Research results manifest in large corpora of patents and scientific papers. However, the two corpora lack a consistent taxonomy, and references across document types are sparse. For these reasons, and because each collection uses its own contrasting, domain-specific language, recommending similar papers for a given patent (or vice versa) is challenging.
We propose a recommender system that leverages topic distributions and keywords to recommend related work despite these challenges. As a case study, we evaluate our approach on patents and papers from two fields: medicine and computer science. We find that topic-based recommenders complement word-based recommenders for documents with collection-specific language and increase mean average precision by up to 27%. As a result of our work, publications from both corpora form a joint digital library, which connects academia and industry.
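To illustrate how a topic-based signal can complement a word-based one, the following is a minimal sketch and not the system evaluated in this paper. It assumes each document is already represented by a keyword vector and a topic distribution, and it combines the two cosine similarities with a weighted sum. The names `hybrid_rank` and `alpha`, the weighted-sum combination, and the toy vectors are assumptions made for illustration only. Mean average precision is computed in its standard form.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def hybrid_rank(query, candidates, alpha=0.5):
    """Rank candidate ids by a weighted sum of keyword and topic similarity.

    `alpha` is a hypothetical mixing weight: 1.0 uses keywords only,
    0.0 uses topic distributions only.
    """
    scored = []
    for doc_id, doc in candidates.items():
        kw_sim = cosine(query["keywords"], doc["keywords"])
        topic_sim = cosine(query["topics"], doc["topics"])
        scored.append((alpha * kw_sim + (1.0 - alpha) * topic_sim, doc_id))
    # Highest score first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevants):
    """Mean of the average precision values over all queries."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data (invented): a patent query and two candidate papers.
patent = {"keywords": [1, 0, 0], "topics": [0.9, 0.1]}
papers = {
    "paper_a": {"keywords": [0, 1, 0], "topics": [0.8, 0.2]},  # same topic, no shared keywords
    "paper_b": {"keywords": [1, 0, 1], "topics": [0.1, 0.9]},  # shared keyword, different topic
}
keyword_only = hybrid_rank(patent, papers, alpha=1.0)
combined = hybrid_rank(patent, papers, alpha=0.5)
```

In this toy setting, `paper_a` shares no keywords with the patent, so a keyword-only ranking places it last, while the combined score lets the topic similarity move it to the top. This mirrors the collection-specific vocabulary problem described above, but the numbers carry no empirical meaning.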
Notes
- 3. Larger keyword vectors increase runtime but do not improve result quality.
- 5. Parameters set as suggested in the original paper: \(\beta = 0.01\), \(\delta = 0.01\), \(\gamma_1 = 1\), \(\gamma_2 = 1\).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Risch, J., Krestel, R. (2017). What Should I Cite? Cross-Collection Reference Recommendation of Patents and Papers. In: Kamps, J., Tsakonas, G., Manolopoulos, Y., Iliadis, L., Karydis, I. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2017. Lecture Notes in Computer Science(), vol 10450. Springer, Cham. https://doi.org/10.1007/978-3-319-67008-9_4
DOI: https://doi.org/10.1007/978-3-319-67008-9_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67007-2
Online ISBN: 978-3-319-67008-9
eBook Packages: Computer Science, Computer Science (R0)