Abstract
The citation count is an important factor in estimating the relevance and significance of academic publications. However, this measure cannot be used for papers that are too new to have accumulated citations. One solution to this problem is to predict future citation counts. Existing work indicates that graph mining techniques lead to the best results. We aim to improve the prediction of future citation counts by introducing a new feature. This feature is based on frequent graph pattern mining in the so-called citation network, which is constructed from a dataset of scientific publications. Our new feature improves the accuracy of citation count prediction and outperforms the state-of-the-art features in many cases, as we show with experiments on two real datasets.
References
Pobiedina N, Ichise R (2014) Predicting citation counts for academic literature using graph pattern mining. In: Proceeding IEA/AIE, pp 109–119
Garfield E (2001) Impact factors, and why they won’t go away. Nature 411(6837):522
Hirsch J (2005) An index to quantify an individual’s scientific research output. Proc Natl Acad Sci USA 102(46):16569
Beel J, Gipp B (2009) Google scholar’s ranking algorithm: The impact of citation counts (an empirical study). In: Proceeding RCIS, pp 439–446
Bethard S, Jurafsky D (2010) Who should I cite: learning literature search models from citation behavior. In: Proceeding CIKM, pp 609–618
Callaham M, Wears R, Weber E (2002) Journal prestige, publication bias, and other characteristics associated with citation of published studies in peer-reviewed journals. J Am Med Assoc 287(21):2847–2850
Kulkarni AV, Busse JW, Shams I (2007) Characteristics associated with citation rate of the medical literature. PLOS One 2(5)
Didegah F, Thelwall M (2013) Determinants of research citation impact in nanoscience and nanotechnology. JASIST 64(5):1055–1064
Livne A, Adar E, Teevan J, Dumais S (2013) Predicting citation counts using text and graph mining. In: Proceeding the iConference 2013 Workshop on Computational Scientometrics: Theory and Applications
Bringmann B, Berlingerio M, Bonchi F, Gionis A (2010) Learning and predicting the evolution of social networks. IEEE Intell Syst 25:26–35
Yan R, Tang J, Liu X, Shan D, Li X (2011) Citation count prediction: learning to estimate future citations for literature. In: Proceeding CIKM, pp 1247–1252
Mcgovern A, Friedl L, Hay M, Gallagher B, Fast A, Neville J, Jensen D (2003) Exploiting relational structure to understand publication patterns in high-energy physics. SIGKDD Explorations 5:2003
Yan R, Huang C, Tang J, Zhang Y, Li X (2012) To better stand on the shoulder of giants. In: Proceeding JCDL, pp 51–60
Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Adamic LA, Adar E (2003) Friends and neighbors on the web. Soc Networks 25(3):211–230
Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. JASIST 58(7):1019–1031
Munasinghe L, Ichise R (2012) Time score: A new feature for link prediction in social networks. IEICE Trans 95-D(3):821–828
Shi X, Leskovec J, McFarland DA (2010) Citing for high impact. In: Proceeding JCDL, pp 49–58
Devroye L, Györfi L, Lugosi G (1996) A Probabilistic Theory of Pattern Recognition. Springer
Chang CC, Lin CJ (2011) LIBSVM: A library for support vector machines. ACM Trans Intell Syst Technol 2(3):1–27
Hothorn T, Hornik K, Zeileis A (2006) Unbiased recursive partitioning: A conditional inference framework. J Comp Graph Stat 15(3):651–674
Breiman L, Friedman J, Stone CJ, Olshen R (1984) Classification and Regression Trees. Chapman and Hall/CRC
The R project for statistical computing http://www.r-project.org/ (January 2013)
Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895–1923
Sokolova M, Lapalme G (2009) A systematic analysis of performance measures for classification tasks. Inf Process Manag 45(4):427–437
Additional information
This is an extended and enhanced version of the results published in [1].
About this article
Cite this article
Pobiedina, N., Ichise, R. Citation count prediction as a link prediction problem. Appl Intell 44, 252–268 (2016). https://doi.org/10.1007/s10489-015-0657-y
DOI: https://doi.org/10.1007/s10489-015-0657-y