Enhancing citation recommendation using citation network embedding

Scientometrics

Abstract

Automatic citation recommendation has been a focal point of research in scholarly digital libraries. Many graph-based citation recommendation algorithms have been proposed; however, most of them rely on local citation behavior in the citation network and therefore tend to recommend papers in close proximity to the query article. In this paper, we propose to capture the global citation behavior in the citation network and use it to enhance citation recommendation performance. Specifically, we develop a novel citation network embedding algorithm, ConvCN, to encode the citation relationships among papers. We then enhance existing graph-based citation recommendation algorithms by incorporating ConvCN. ConvCN is shown to improve citation recommendation performance by 44.86% and 34.87% on average in terms of Bpref and F-measure@20, respectively. These findings not only confirm that global citation behavior is useful for improving traditional citation recommendation algorithms but also suggest that ConvCN could be adapted to other recommendation tasks that rely on graph-like information, such as item recommendation in social networks and people recommendation in referral networks.
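
For orientation, the two evaluation metrics named in the abstract can be sketched as below. This is a minimal illustration, not the authors' evaluation code: the function names and the toy paper identifiers are hypothetical, and Bpref is simplified by assuming that every recommended paper outside the ground-truth reference list counts as judged non-relevant. The exact variant used in the article is not stated in the abstract.

```python
def f_measure_at_k(ranked, relevant, k=20):
    """Harmonic mean of precision@k and recall@k for one query paper."""
    relevant = set(relevant)
    if not relevant or k <= 0:
        return 0.0
    hits = sum(1 for paper in ranked[:k] if paper in relevant)
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


def bpref(ranked, relevant):
    """Simplified Bpref: each retrieved relevant paper is penalised by the
    number of non-relevant papers ranked above it (capped at R), where R is
    the number of relevant papers.

    Assumption: any recommended paper not in `relevant` is treated as
    judged non-relevant.
    """
    relevant = set(relevant)
    r = len(relevant)
    if r == 0:
        return 0.0
    nonrelevant_seen = 0
    score = 0.0
    for paper in ranked:
        if paper in relevant:
            score += 1.0 - min(nonrelevant_seen, r) / r
        else:
            nonrelevant_seen += 1
    return score / r


# Toy example with hypothetical paper IDs.
ranked_list = ["p1", "x1", "p2", "x2", "x3"]
ground_truth = {"p1", "p2", "p3"}
f5 = f_measure_at_k(ranked_list, ground_truth, k=5)
bp = bpref(ranked_list, ground_truth)
```

F-measure@k looks only at the top k recommendations, whereas Bpref rewards rankings that place relevant papers above non-relevant ones, which is why the two figures in the abstract are reported separately.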

Notes

  1. http://www.scholar.google.com.

  2. http://www.academic.microsoft.com.

  3. http://www.citeseerx.ist.psu.edu.

  4. http://www.ieeexplore.ieee.org.

  5. http://www.dl.acm.org.

  6. https://www.wordnet-rdf.princeton.edu/lemma/luck.

  7. http://www.citeulike.org.

  8. https://www.aminer.cn.

  9. https://www.pytorch.org.

  10. https://www.github.com/shenweichen/GraphEmbedding.

References

  • Agrawal, A., George, R. A., Ravi, S. S., Kamath, S., & Kumar, A. (2019). Ars_nitk at mediqa 2019: Analysing various methods for natural language inference, recognising question entailment and medical question answering system. In Proceedings of the 18th BioNLP workshop and shared task (pp. 533–540).

  • Ali, Z., Qi, G., Muhammad, K., Ali, B., & Abro, W. A. (2020). Paper recommendation based on heterogeneous network embedding. Knowledge-Based Systems, 210, 106438.

  • Ali, Z., Qi, G., Muhammad, K., Kefalas, P., & Khusro, S. (2021). Global citation recommendation employing generative adversarial network. Expert Systems with Applications, 180, 114888.

  • Amjad, T., Daud, A., Che, D., & Akram, A. (2016). Muice: Mutual influence and citation exclusivity author rank. Information Processing & Management (pp. 374–386).

  • Bhagavatula, C., Feldman, S., Power, R., & Ammar, W. (2018a). Content-based citation recommendation. In Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: Human language technologies, Volume 1 (Long Papers) (pp. 238–251). New Orleans, Louisiana. Association for Computational Linguistics. https://doi.org/10.18653/v1/N18-1022. URL https://aclanthology.org/N18-1022.

  • Bhagavatula, C., Feldman, S., Power, R., & Ammar, W. (2018b). Content-based citation recommendation. CoRR.

  • Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research (pp. 993–1022).

  • Bordes, A., Usunier, N., García-Durán, A., Weston, J., & Yakhnenko, O. (2013). Translating embeddings for modeling multi-relational data. In NIPS (pp. 2787–2795).

  • Bramsen, P., Deshpande, P., Lee, Y. K., & Barzilay, R. (2006). Inducing temporal graphs. In Proceedings of the 2006 conference on empirical methods in natural language processing (pp. 189–198).

  • Cai, H., Zheng, V. W., & Chang, K.C.-C. (2018). A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering, 30(9), 1616–1637.

  • Caragea, C., Silvescu, A., Mitra, P., & Giles, C. L. (2013). Can’t see the forest for the trees? A citation recommendation system. In Proceedings of the 13th ACM/IEEE-CS joint conference on digital libraries (pp. 111–114).

  • Chakraborty, T., Modani, N., Narayanam, R., & Nagar, S. (2015). Discern: A diversified citation recommendation system for scientific queries. In 2015 IEEE 31st international conference on data engineering (pp. 555–566).

  • Chen, J., & Zhuge, H. (2014). Summarization of scientific documents by detecting common facts in citations. Future Generation Computer Systems (pp. 246–252).

  • Chen, E., Tang, X., & Fu, B. (2018). A modified pedestrian retrieval method based on faster r-cnn with integration of pedestrian detection and re-identification. In 2018 International conference on audio, language and image processing (ICALIP) (pp. 63–66). IEEE.

  • Chen, X., Zhao, H.-J., Zhao, S., Chen, J., & Zhang, Y.-P. (2019). Citation recommendation based on citation tendency. Scientometrics (pp. 937–956).

  • Choi, J., Kim, T., & Lee, S.-G. (2018). Element-wise bilinear interaction for sentence matching. In Proceedings of the seventh joint conference on lexical and computational semantics (pp. 107–112).

  • Cohan, A., Feldman, S., Beltagy, I., Downey, D., & Weld, D. S. (2020). Specter: Document-level representation learning using citation-informed transformers. In Proceedings of the 58th annual meeting of the association for computational linguistics (ACL 2020).

  • Dai, T., Zhu, L., Wang, Y., & Carley, K. M. (2020). Attentive stacked denoising autoencoder with bi-lstm for personalized context-aware citation recommendation. IEEE/ACM Transactions on Audio, Speech, and Language Processing (pp. 553–568).

  • Dettmers, T., Minervini, P., Stenetorp, P., & Riedel, S. (2018). Convolutional 2d knowledge graph embeddings. In Proceedings of the 32nd AAAI conference on artificial intelligence (pp. 1811–1818).

  • Eto, M. (2019). Extended co-citation search: Graph-based document retrieval on a co-citation network containing citation context information. Information Processing & Management.

  • Fiala, D. (2010). Mining citation information from citeseer data. Scientometrics (pp. 553–562).

  • Frost, C. O. (1979). The use of citations in literary research: A preliminary classification of citation functions. The Library Quarterly (pp. 399–414).

  • Gao, Y., Wu, Q., & Zhu, L. (2020). Merging the citations received by arxiv-deposited e-prints and their corresponding published journal articles: Problems and perspectives. Information Processing & Management.

  • Gipp, B. (2014). Citation-based plagiarism detection. In Citation-based plagiarism detection (pp. 57–88).

  • Gori, M., & Pucci, A. (2006). Research paper recommender systems: A random-walk based approach. In 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI’06) (pp. 778–781).

  • Grover, A., & Leskovec, J. (2016). Node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 855–864).

  • Hamid, I., Wu, Yu., Nawaz, Q., & Zhao, R. (2018). A fast heuristic detection algorithm for visualizing structure of large community. Journal of Computational Science, 25, 280–288.

  • Haruna, K., Ismail, M. A., Qazi, A., Kakudi, H. A., Hassan, M., Muaz, S. A., & Chiroma, H. (2020). Research paper recommender system based on public contextual metadata. Scientometrics, 125(1), 101–114.

  • He, Q., Chen, B., Pei, J., Qiu, B., Mitra, P., & Giles, L. (2009). Detecting topic evolution in scientific literature: How can citations help? In Proceedings of the 18th ACM conference on information and knowledge management (pp. 957–966).

  • He, Q., Pei, J., Kifer, D., Mitra, P., & Giles, L. (2010). Context-aware citation recommendation. In Proceedings of the 19th international conference on world wide web (pp. 421–430). New York, NY, USA. Association for Computing Machinery.

  • Huang, W., Kataria, S., Caragea, C., Mitra, P., Giles, C. L., & Rokach, L. (2012). Recommending citations: Translating papers into references. In Proceedings of the 21st ACM international conference on information and knowledge management (pp. 1910–1914).

  • Huang, W., Wu, Z., Liang, C., Mitra, P., & Giles, C. L. (2015). A neural probabilistic model for context based citation recommendation. In Twenty-ninth AAAI conference on artificial intelligence.

  • Huang, W., Wu, Z., Mitra, P., & Giles, C. L. (2014). Refseer: A citation recommendation system. In IEEE/ACM joint conference on digital libraries (pp. 371–374). IEEE.

  • Jeong, C., Jang, S., Park, E., & Choi, S. (2020). A context-aware citation recommendation model with bert and graph convolutional networks. Scientometrics, 124(3), 1907–1922.

  • Jia, H., & Saule, E. (2017). An analysis of citation recommender systems: Beyond the obvious. In Proceedings of the 2017 IEEE/ACM international conference on advances in social networks analysis and mining 2017 (pp. 216–223).

  • Jia, H., & Saule, E. (2018a). Local is good: A fast citation recommendation approach. In G. Pasi, B. Piwowarski, L. Azzopardi, & A. Hanbury (Eds.), Advances in information retrieval (pp. 758–764).

  • Jia, H., & Saule, E. (2018b). Local is good: A fast citation recommendation approach. In European conference on information retrieval (pp. 758–764). Springer.

  • Jiang, Z., Liu, X., & Gao, L. (2015). Chronological citation recommendation with information-need shifting. In Proceedings of the 24th ACM international on conference on information & knowledge management (pp. 1291–1300).

  • Jiang, Z., Yin, Y., Gao, L., Lu, Y., & Liu, X. (2018). Cross-language citation recommendation via hierarchical representation learning on heterogeneous graph. In The 41st international ACM SIGIR conference on research & development in information retrieval (pp. 635–644).

  • Jiang, X., Zhu, R., Li, S., & Ji, P. (2020). Co-embedding of nodes and edges with graph neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence.

  • Kataria, S., Mitra, P., & Bhatia, S. (2010). Utilizing context in generative bayesian models for linked corpus. In Twenty-fourth AAAI conference on artificial intelligence.

  • Keshavarz, H., Seifi, S. T., & Izadi, M. (2019). A deep learning-based approach for measuring the domain similarity of persian texts. arXiv preprint arXiv:1909.09690.

  • Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. CoRR.

  • Kobayashi, Y., Shimbo, M., & Matsumoto, Y. (2018). Citation recommendation using distributed representation of discourse facets in scientific articles. In Proceedings of the 18th ACM/IEEE on joint conference on digital libraries (pp. 243–251).

  • Lin, Y., Liu, Z., Sun, M., Liu, Y., & Zhu, X. (2015). Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the twenty-ninth AAAI conference on artificial intelligence (pp. 2181–2187).

  • Liu, H., Kou, H., Yan, C., & Qi, L. (2019). Link prediction in paper citation network to construct paper correlation graph. EURASIP Journal on Wireless Communications and Networking (p. 233).

  • Ma, N., Guan, J., & Zhao, Y. (2008). Bringing pagerank to the citation analysis. Information Processing & Management (pp. 800–810).

  • Ma, A., You, F., Jing, M., Li, J., & Lu, K. (2020). Multi-source domain adaptation with graph embedding and adaptive label prediction. Information Processing & Management (p. 102367).

  • McNee, S. M., Albert, I., Cosley, D., Gopalkrishnan, P., Lam, S. K., Rashid, A. M., Konstan, J. A., & Riedl, J. (2002). On the recommending of citations for research papers. In Proceedings of the 2002 ACM conference on computer supported cooperative work (pp. 116–125).

  • Meng, F., Gao, D., Li, W., Sun, X., & Hou, Y. (2013). A unified graph model for personalized query-oriented reference paper recommendation. In Proceedings of the 22nd ACM international conference on Information & Knowledge Management (pp. 1509–1512).

  • Miller, G. A. (1995). Wordnet: A lexical database for English. Communications of the ACM (pp. 39–41).

  • Naak, A., Hage, H., & Aïmeur, E. (2009). A multi-criteria collaborative filtering approach for research paper recommendation in papyres. In G. Babin, P. Kropf, & M. Weiss (Eds.), E-Technologies: Innovation in an open world (pp. 25–39).

  • Najafabadi, M. K., Mohamed, A., & Onn, C. W. (2019). An impact of time and item influencer in collaborative filtering recommendations using graph-based model. Information Processing & Management, 56(3), 526–540.

  • Nallapati, R. M., Ahmed, A., Xing, E. P., & Cohen, W. W. (2008). Joint latent topic models for text and citations. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 542–550).

  • Nguyen, D. Q., Nguyen, T. D., Nguyen, D. Q., & Phung, D. (2018). A novel embedding model for knowledge base completion based on convolutional neural network. In The 16th annual conference of the North American Chapter of the Association for computational linguistics: Human language technologies (NAACL-HLT) (pp. 327–333).

  • Nickel, M., Tresp, V., & Kriegel, H.-P. (2011). A three-way model for collective learning on multi-relational data. In Proceedings of the 28th international conference on machine learning (pp. 809–816).

  • Nozza, D., Fersini, E., & Messina, E. (2020). Cage: Constrained deep attributed graph embedding. Information Sciences, 518, 56–70.

  • Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The pagerank citation ranking: Bringing order to the web. In WWW 1999.

  • Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 701–710).

  • Pinski, G., & Narin, F. (1976). Citation influence for journal aggregates of scientific publications: Theory, with application to the literature of physics. Information Processing & Management (pp. 297–312).

  • Pornprasit, C., Liu, X., Kertkeidkachorn, N., Kim, K.-S., Noraset, T., & Tuarob, S. (2020). Convcn: A cnn-based citation network embedding algorithm towards citation recommendation. In Proceedings of the ACM/IEEE joint conference on digital libraries in 2020 (pp. 433–436).

  • Qian, Y., Liu, Y., Xu, X., & Sheng, Q. Z. (2020). Leveraging citation influences for modeling scientific documents. World Wide Web (pp. 1–22).

  • Savov, P., Jatowt, A., & Nielek, R. (2020). Identifying breakthrough scientific papers. Information Processing & Management.

  • Schafer, J. B., Frankowski, D., Herlocker, J., & Sen, S. (2007). Collaborative Filtering Recommender Systems (pp. 291–324).

  • Seeger, M. (2003). Bayesian gaussian process models: Pac-bayesian generalisation error bounds and sparse approximations.

  • Seglen, P. O. (1997). Citations and journal impact factors: Questionable indicators of research quality. Allergy (pp. 1050–1056).

  • Singh, V., Verma, S., & Chaurasia, S. S. (2020). Mapping the themes and intellectual structure of corporate university: Co-citation and cluster analyses. Scientometrics, 122(3), 1275–1302.

  • Tabrizi, S. A., Shakery, A., Zamani, H., & Tavallaei, M. A. (2018). Person: Personalized information retrieval evaluation based on citation networks. Information Processing & Management (pp. 630–656).

  • Tang, J., & Zhang, J. (2009). A discriminative approach to topic-based citation recommendation. In T. Theeramunkong, B. Kijsirikul, N. Cercone, & T.-B. Ho (Eds.), Advances in knowledge discovery and data mining (pp. 572–579).

  • Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., & Mei, Q. (2015). Line: Large-scale information network embedding. In Proceedings of the 24th international conference on world wide web (pp. 1067–1077).

  • Tang, J., Sun, J., Wang, C., & Yang, Z. (2009). Social influence analysis in large-scale networks. In Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 807–816).

  • Taşkın, Z., & Al, U. (2018). A content-based citation analysis study based on text categorization. Scientometrics (pp. 335–357).

  • Torres, R., McNee, S. M., Abel, M., Konstan, J. A., & Riedl, J. (2004). Enhancing digital libraries with techlens+. In Proceedings of the 4th ACM/IEEE-CS joint conference on digital libraries (pp. 228–236).

  • Tuarob, S., Bhatia, S., Mitra, P., & Giles, C. L. (2016). Algorithmseer: A system for extracting and searching for algorithms in scholarly big data. IEEE Transactions on Big Data (pp. 3–17).

  • Tuarob, S., Mitra, P., & Giles, C. L. (2012). Improving algorithm search using the algorithm co-citation network. In Proceedings of the 12th ACM/IEEE-CS joint conference on digital libraries (pp. 277–280).

  • Tuarob, S., Pouchard, L. C., & Giles, C. L. (2013). Automatic tag recommendation for metadata annotation using probabilistic topic modeling. In Proceedings of the 13th ACM/IEEE-CS joint conference on digital libraries (pp. 239–248).

  • Tuarob, S., Pouchard, L. C., Mitra, P., & Giles, C. L. (2015). A generalized topic modeling approach for automatic document annotation. International Journal on Digital Libraries (pp. 111–128).

  • Tuarob, S., Kang, S. W., Wettayakorn, P., Pornprasit, C., Sachati, T., Hassan, S. U., & Haddawy, P. (2020). Automatic classification of algorithm citation functions in scientific literature. IEEE Transactions on Knowledge and Data Engineering, 32(10), 1881–1896. https://doi.org/10.1109/TKDE.2019.2913376.

  • Wang, Z., Zhang, J., Feng, J., & Chen, Z. (2014). Knowledge graph embedding by translating on hyperplanes. In Proceedings of the twenty-eighth AAAI conference on artificial intelligence (pp. 1112–1119).

  • Wang, J., Zhu, L., Dai, T., & Wang, Y. (2020). Deep memory network with bi-lstm for personalized context-aware citation recommendation. Neurocomputing (pp. 103–113).

  • Yan, E., & Ding, Y. (2011). Discovering author impact: A pagerank perspective. Information Processing & Management, 47(1), 125–134.

  • Yang, C., Wei, B., Wu, J., Zhang, Y., & Zhang, L. (2009). Cares: A ranking-oriented cadal recommender system. In Proceedings of the 9th ACM/IEEE-CS joint conference on digital libraries (pp. 203–212).

  • Zhang, Y., & Ma, Q. (2020). Doccit2vec: Citation recommendation via embedding of content and structural contexts. IEEE Access (pp. 115865–115875).

  • Zhang, S., Zhao, D., Cheng, R., Cheng, J., & Wang, H. (2016). Finding influential papers in citation networks. In 2016 IEEE first international conference on data science in cyberspace (DSC) (pp. 658–662).

  • Zhou, D., Zhu, S., Yu, K., Song, X., Tseng, B. L., Zha, H., & Giles, C. L. (2008). Learning multiple graphs for document recommendations. In Proceedings of the 17th international conference on World Wide Web (pp. 141–150).

  • Zhu, Q., Zhou, X., Zhang, P., & Shi, Y. (2019). A neural translating general hyperplane for knowledge graph embedding. Journal of computational science, 30, 108–117.

Acknowledgements

This research is supported by the Thailand Science Research and Innovation (TSRI), formerly known as Thailand Research Fund (TRF), through Grant RSA6280105. In addition, we would also like to acknowledge partial support from JSPS Grant-in-Aid for Scientific Research (Grant No. 21K12042) and the New Energy and Industrial Technology Development Organization (Grant No. JPNP20006). This manuscript is an extension of the authors' earlier work presented at the ACM/IEEE 20th Joint Conference on Digital Libraries (JCDL 2020) (Pornprasit et al. 2020).

Author information

Corresponding author

Correspondence to Suppawong Tuarob.

About this article

Cite this article

Pornprasit, C., Liu, X., Kiattipadungkul, P. et al. Enhancing citation recommendation using citation network embedding. Scientometrics 127, 233–264 (2022). https://doi.org/10.1007/s11192-021-04196-3

  • DOI: https://doi.org/10.1007/s11192-021-04196-3
