Predicting the impact and publication date of individual scientists’ future papers

Scientometrics

Abstract

Predicting the future career of individual scientists is an important yet challenging problem with numerous applications, such as recruiting for scientific research positions, promoting outstanding academic staff, and managing scientific grant proposals. Although much effort has been devoted to predicting scientists’ future performance and success, these works focus on the macro-level future performance of scholars from the perspective of their career ages. A related but different task is to predict the impact and publication date of each future paper. We regard this micro-level prediction problem as a dynamic series auto-regression task and design a deep learning method to solve it. Experiments show that our method outperforms the state-of-the-art method on this task.

Notes

  1. https://journals.aps.org/datasets.

  2. https://www.tensorflow.org/.

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant 71731002. Rui-Jie Wang acknowledges the support from the China Scholarship Council (CSC).

Author information

Corresponding author

Correspondence to An Zeng.

About this article

Cite this article

Zhou, Y., Wang, R. & Zeng, A. Predicting the impact and publication date of individual scientists’ future papers. Scientometrics 127, 1867–1882 (2022). https://doi.org/10.1007/s11192-022-04286-w

  • DOI: https://doi.org/10.1007/s11192-022-04286-w
