Abstract
In this paper, we analyze the structure of the article citation network of a particular subject obtained from the Web of Science (WoS) database. Specifically, we modify a model proposed in Caldarelli et al. (Phys Rev Lett 89(25):258702, 2002) and develop a generative model for article citation networks in which an article receives citations based on a newly defined property called “importance”. Since the importance of an article cannot be measured directly, we use the in-degree of an article, that is, the number of citations it receives from other articles, as a surrogate quantity for its importance. We simulate in-degree distributions to estimate the parameters of the tapered Pareto distribution. In comparisons between the generated data and data from the real network, the generative model performs well, especially for the citation network of recent years.
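As a rough illustration of the kind of importance-driven generative process described above, the following minimal sketch grows a citation network in which each new article cites earlier articles with probability proportional to a latent importance value. It is not the authors' model: the function names, the fixed number of references per article, the proportional citation rule, and the choice to draw importance directly from a tapered Pareto distribution are all assumptions made for illustration. The sampler relies on the fact that a tapered Pareto variable with survival function (a/x)^β exp((a − x)/θ) for x ≥ a has the same distribution as the minimum of a Pareto(a, β) variable and a + Exponential(mean θ) variable.

```python
import random


def sample_tapered_pareto(a, beta, theta, rng):
    """Draw one value with survival (a/x)**beta * exp((a - x)/theta), x >= a.

    Uses the minimum of a Pareto draw and a shifted exponential draw,
    whose survival functions multiply to the tapered Pareto survival.
    """
    u = 1.0 - rng.random()                          # uniform on (0, 1]
    pareto_draw = a * u ** (-1.0 / beta)            # Pareto(a, beta) via inverse CDF
    expo_draw = a + rng.expovariate(1.0 / theta)    # a + exponential with mean theta
    return min(pareto_draw, expo_draw)


def generate_citation_network(n_articles, refs_per_article, a, beta, theta, seed=0):
    """Grow a toy citation network (hypothetical rule, for illustration only).

    Each new article cites up to `refs_per_article` distinct earlier articles,
    chosen with probability proportional to their latent importance.
    Returns (importance, in_degree, edges) with edges as (citing, cited) pairs.
    """
    rng = random.Random(seed)
    importance, in_degree, edges = [], [], []
    for new in range(n_articles):
        candidates = list(range(new))
        weights = importance[:]
        # Weighted sampling without replacement among earlier articles.
        for _ in range(min(refs_per_article, new)):
            idx = rng.choices(range(len(candidates)), weights=weights)[0]
            cited = candidates.pop(idx)
            weights.pop(idx)
            edges.append((new, cited))
            in_degree[cited] += 1
        importance.append(sample_tapered_pareto(a, beta, theta, rng))
        in_degree.append(0)
    return importance, in_degree, edges


importance, in_degree, edges = generate_citation_network(
    n_articles=200, refs_per_article=5, a=1.0, beta=1.5, theta=50.0, seed=42
)
```

In this sketch the importance values are known because they are simulated; in the setting described above they are unobserved, which is why the in-degree is used as a surrogate when estimating the distribution parameters.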









References
Albert, R., Jeong, H., & Barabási, A. L. (1999). Diameter of the world-wide web. Nature, 401, 130–131.
Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509–512.
Barabási, A. L., Albert, R., Jeong, H., & Bianconi, G. (2000). Power-law distribution of the world wide web. Science, 287, 2115a.
Bianconi, G., & Barabási, A. L. (2001). Competition and multiscaling in evolving networks. Europhysics Letters (EPL), 54(4), 436–442.
Caldarelli, G., Capocci, A., De Los Rios, P., & Muñoz, M. A. (2002). Scale-free networks from varying vertex intrinsic fitness. Physical Review Letters, 89(25), 258702.
Clauset, A., Shalizi, C. R., & Newman, M. E. J. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661–703.
Dorogovtsev, S. N., & Mendes, J. F. F. (2001). Effect of the accelerating growth of communications networks on their structure. Physical Review E, 63(2), 025101.
Dorogovtsev, S. N., Mendes, J. F. F., & Samukhin, A. N. (2000). Structure of growing networks with preferential linking. Physical Review Letters, 85(21), 4633.
Erdős, P., & Rényi, A. (1959). On random graphs I. Publicationes Mathematicae Debrecen, 6, 290.
Gilbert, E. N. (1959). Random graphs. Annals of Mathematical Statistics, 30(4), 1141–1144.
Jung, H., Lee, J. G., Lee, N., & Kim, S. H. (2018). Comparison of fitness and popularity: Fitness-popularity dynamic network model. Journal of Statistical Mechanics, 2018(12), 123403.
Ke, Q., Ferrara, E., Radicchi, F., & Flammini, A. (2015). Defining and identifying sleeping beauties in science. Proceedings of the National Academy of Sciences, 112(24), 7426–7431.
Krapivsky, P. L., & Redner, S. (2001). Organization of growing random networks. Physical Review E, 63(6), 066123.
Kagan, Y. Y., & Schoenberg, F. P. (2001). Estimation of the upper cutoff parameter for the tapered Pareto distribution. Journal of Applied Probability, 38A, 168–185.
Mandelbrot, B. B. (1965). Information theory and psycholinguistics. In B. B. Wolman & E. Nagel (Eds.), Scientific psychology. Basic Books.
Newman, M. E. (2003). The structure and function of complex networks. SIAM Review, 45(2), 167–256.
Newman, M. E. (2005). Power laws, Pareto distributions and Zipf’s law. Contemporary Physics, 46(5), 323–351.
Pennock, D. M., Flake, G. W., Lawrence, S., Glover, E. J., & Giles, C. L. (2002). Winners don’t take all: Characterizing the competition for links on the web. Proceedings of the National Academy of Sciences, 99, 5207–5211.
Pham, T., Sheridan, P., & Shimodaira, H. (2016). Joint estimation of preferential attachment and node fitness in growing complex networks. Scientific Reports, 6, 32558.
Phoa, F. K. H., & Sanchez, J. (2013). Modeling the browsing behaviour of world wide web users. Open Journal of Statistics, 3, 145–154.
Phoa, F. K. H., & Lin, W. C. (2013). High-quality winners take more: Modeling non-scale-free bulletin forums with content variations. Journal of Data Science, 11, 559–573.
Pritchard, A. (1969). Statistical bibliography or bibliometrics? Journal of Documentation, 25(4), 348–349.
Van Noortwijk, J. M. (2009). A survey of the application of Gamma processes in maintenance. Reliability Engineering and System Safety, 94(1), 2–11.
Acknowledgements
The authors would like to thank Clarivate Analytics for providing access to the raw data of the Web of Science database for research investigations, the URA team of ISM for transforming the data into the neo4j database and providing it for analysis in this work, and Ms. Ula Tzu-Ning Kung for providing English editing services for this paper. In addition, the authors would like to thank the two reviewers, who provided many constructive comments and suggestions that improved the quality of this paper. This project was partly supported by Academia Sinica Grant No. AS-TP-109-M07 and the Ministry of Science and Technology (Taiwan) Grant Nos. 107-2118-M-001-011-MY3, 107-2321-B-001-038, 108-2321-B-001-016, and 109-2321-B-001-013. The third author was partly supported by JSPS KAKENHI Grant Number JP20K11715.
Cite this article
Chang, L.LH., Phoa, F.K.H. & Nakano, J. A generative model of article citation networks of a subject from a large-scale citation database. Scientometrics 126, 7373–7395 (2021). https://doi.org/10.1007/s11192-021-04037-3
DOI: https://doi.org/10.1007/s11192-021-04037-3