Abstract
With the rapid development of scientific research, a large number of scientific papers are published every year. Quickly finding influential papers in this massive body of literature is important: it helps researchers identify papers with reference value and helps research management departments allocate resources. Among the quantitative measures of academic impact, citation count stands out for its frequent use in the research community. Previous studies have either treated papers as independent individuals, ignoring their relationships in the citation network, or have not adequately considered the long-term dependence in citation time series. In this paper, we consider the structural features of citation networks and propose AGSTA-NET, a deep learning method based on spatio-temporal fusion. It models the heterogeneous citation network formed in the early period after a paper's publication and predicts the paper's citation count over the next few years. AGSTA-NET contains a module that captures spatial dependence and a module that captures temporal dependence. Taking only the heterogeneous citation network as input, it can mine the complex spatio-temporal information in the dynamic heterogeneous citation network. In addition, the sub-networks designed in this paper adaptively determine the threshold of the loss function according to the samples, which improves training. Experiments show that AGSTA-NET outperforms current state-of-the-art methods in citation count prediction.
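As a rough illustration of the spatio-temporal idea described above, the following sketch first aggregates neighbor information within each yearly snapshot of a citation network (the spatial step) and then attends over the resulting per-year embeddings (the temporal step). It is not the AGSTA-NET architecture: the function names, the dot-product attention scores, the plain (non-heterogeneous) toy graph, and the untrained `readout` weights are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def spatial_step(features, neighbors):
    """Attention-weighted aggregation over each node's neighborhood
    in one snapshot (hypothetical stand-in for a spatial module)."""
    out = {}
    for node, feat in features.items():
        group = [node] + neighbors.get(node, [])
        # Score each neighborhood member against the node itself.
        weights = softmax([dot(feat, features[n]) for n in group])
        out[node] = [
            sum(w * features[n][d] for w, n in zip(weights, group))
            for d in range(len(feat))
        ]
    return out


def temporal_step(sequence):
    """Attention over per-year embeddings, using the latest year as the
    query (hypothetical stand-in for a temporal module)."""
    query = sequence[-1]
    weights = softmax([dot(query, step) for step in sequence])
    return [
        sum(w * step[d] for w, step in zip(weights, sequence))
        for d in range(len(query))
    ]


def predict(snapshots, target, readout):
    """Combine both steps and apply a linear readout (untrained, assumed)."""
    per_year = [spatial_step(feats, nbrs)[target] for feats, nbrs in snapshots]
    return dot(readout, temporal_step(per_year))


# Toy data: paper "p" gains a second citing neighbor in year 2.
year1 = ({"p": [1.0, 0.0], "a": [0.5, 0.5]}, {"p": ["a"]})
year2 = ({"p": [1.0, 0.2], "a": [0.5, 0.5], "b": [0.1, 0.9]}, {"p": ["a", "b"]})
estimate = predict([year1, year2], "p", readout=[1.0, 1.0])
```

In a trained model the attention scores and the readout would be learned from data; here they are fixed so that the data flow from snapshots to a single estimate is easy to follow.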
Author information
Contributions
Bin Wang: conceived and designed the analysis, collected the data, performed the analysis, and wrote the paper. Feng Wu: collected the data and contributed data or analysis tools. LuKui Shi: conceived and designed the analysis and wrote the paper.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, B., Wu, F. & Shi, L. AGSTA-NET: adaptive graph spatiotemporal attention network for citation count prediction. Scientometrics 128, 511–541 (2023). https://doi.org/10.1007/s11192-022-04541-0
DOI: https://doi.org/10.1007/s11192-022-04541-0