Abstract
The measurement of the quality of academic research is a controversial issue. Hirsch recently proposed a measure, the h index, that has the advantage of condensing into a single summary statistic the information contained in the citation counts of each scientist. Since that seminal paper, a large body of research has followed, focusing on the one hand on the development of correction factors for the h index and, on the other hand, on the pros and cons of the measure, with several alternatives proposed. Although the h index has attracted a great deal of interest from the outset, only a few papers have analyzed its statistical properties and implications. In the present work we propose a statistical approach to derive the distribution of the h index. To this end we work directly on the two basic components of the h index, the number of papers produced and the related vector of citation counts, by introducing convolution models. Our proposal is applied to a homogeneous database of 131 full professors of statistics employed in Italian universities. The results show that while “sufficient” authors are reasonably well detected by a crude bibliometric approach, outstanding ones are underestimated, which motivates the development of a statistically based h index. Our proposal offers such a development and, in particular, provides confidence intervals for comparing authors as well as quality control thresholds that can be used as target values.
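As a rough illustration of the two components named in the abstract, the sketch below computes the h index from a vector of citation counts and then simulates its distribution when both the number of papers and the citations per paper are random. The Poisson paper count and the discretized Pareto citation law are assumptions chosen only for illustration; they are not the convolution model fitted in the paper, and the names `draw_poisson`, `simulate_h` and `empirical_interval` are hypothetical.

```python
import math
import random


def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites < rank:
            break
        h = rank
    return h


def draw_poisson(rng, mean):
    """Draw a Poisson variate with Knuth's method (fine for moderate means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_h(mean_papers, tail_index, n_sims=2000, seed=0):
    """Simulate h under an ASSUMED model: Poisson papers, Pareto-type citations."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        n_papers = draw_poisson(rng, mean_papers)
        # Discretized Pareto draw shifted so that zero citations are possible.
        citations = [int(rng.paretovariate(tail_index)) - 1 for _ in range(n_papers)]
        draws.append(h_index(citations))
    return sorted(draws)


def empirical_interval(sorted_draws, level=0.95):
    """Read an equal-tailed interval off the sorted simulated values."""
    tail = (1.0 - level) / 2.0
    last = len(sorted_draws) - 1
    return sorted_draws[int(tail * last)], sorted_draws[int((1.0 - tail) * last)]


draws = simulate_h(mean_papers=40, tail_index=1.0)
low, high = empirical_interval(draws)
print(h_index([10, 8, 5, 4, 3]), low, high)
```

Under these assumptions, the simulated interval plays the role of the confidence intervals mentioned above: two authors whose intervals overlap substantially would be hard to distinguish on the h index alone.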
References
Ball, P. (2005). Index aims for fair ranking of scientists. Nature, 436, 900.
Beirlant, J. & Einmahl, J.H.J. (2010). Asymptotics for the Hirsch index. Scandinavian Journal of Statistics, 37, 355–364.
Bühlmann, H. (1970). Mathematical methods in risk theory, Grundlehrenband 172. Heidelberg: Springer.
Burrell, Q.L. (2007). Hirsch’s h-index: A stochastic model. Journal of Informetrics, 1, 16–25.
Cerchiello, P. & Giudici, P. (2012). On the distribution of functionals of discrete ordinal variables. Statistics & Probability Letters, 82, 2044–2049.
Cruz, M.G. (2002). Modeling, measuring and hedging operational risk. London: Wiley.
Dalla Valle, L. & Giudici, P. (2008). A Bayesian approach to estimate the marginal loss distributions in operational risk management. Computational Statistics & Data Analysis, 52, 3107–3127.
Evert, S. (2004). A simple LNRE model for random character sequences. In Proceedings of the 7èmes Journées Internationales d'Analyse Statistique des Données Textuelles (JADT 2004) (pp. 411–422). Louvain-la-Neuve, Belgium.
Evert, S. & Baroni, M. (2007). zipfR: Word frequency distributions in R. In Proceedings of the 45th annual meeting of the association for computational linguistics, posters and demonstrations session. Prague, Czech Republic.
Frachot, A., Moudoulaud, O. & Roncalli, T. (2007). Loss distribution approach in practice. In M.K. Ong (Ed.), The Basel handbook. A guide for financial practitioners. London: Risk Books.
Gabaix, X. (2009). Power laws in economics and finance. Annual Review of Economics, 1, 255–293.
Glänzel, W. (2006). On the h-index - A mathematical approach to a new measure of publication activity and citation impact. Scientometrics, 67, 315–321.
Harzing, A.W. (2007). Publish or Perish, available from http://www.harzing.com/pop.htm.
Hirsch, J.E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 102, 16569–16572.
Iglesias, J.E. & Pecharromán, C. (2007). Scaling the h-index for different scientific ISI fields. Scientometrics, 73, 303–320.
Izsak, F. (2006). Maximum likelihood estimation for constrained parameters of multinomial distributions—Application to Zipf-Mandelbrot models. Computational Statistics & Data Analysis, 51, 1575–1583.
Mandelbrot, B. (1962). On the theory of word frequencies and on related Markovian models of discourse. In R. Jakobson (Ed.), Structure of language and its mathematical aspects (pp. 190–219). Providence, RI: American Mathematical Society.
Pratelli, L., Baccini, A., Barabesi, L. & Marcheselli, M. (2012). Statistical analysis of the Hirsch index. Scandinavian Journal of Statistics, 39, 681–694.
Siegel, S. & Castellan, N.J. (1988). Nonparametric statistics for the behavioral sciences (2nd ed.). New York: McGraw-Hill.
Todeschini, R. (2011). The j-index: A new bibliometric index and multivariate comparisons between other common indices. Scientometrics, 87, 621–639.
Acknowledgments
The authors thank the referee(s) for their useful comments and suggestions. The authors also acknowledge the financial support of the MIUR PRIN project MISURA (‘Multivariate models for risk assessment’).
Cite this article
Cerchiello, P., Giudici, P. On a statistical h index. Scientometrics 99, 299–312 (2014). https://doi.org/10.1007/s11192-013-1194-2
DOI: https://doi.org/10.1007/s11192-013-1194-2