
Probability and expected frequency of breakthroughs: basis and use of a robust method of research assessment

Abstract

In research policy, effective measures that improve the generation of knowledge must rest on reliable methods of research assessment, yet for many countries and institutions no such methods are in place. Publication and citation analyses can be used to estimate the part played by countries and institutions in the global progress of knowledge, but a concrete method of estimation is far from evident. The challenge arises because publications reporting real progress of knowledge form an extremely small proportion of all publications; in most countries and institutions such contributions appear fewer than once per year. One way to overcome this difficulty is to calculate probabilities instead of counting the rare events on which scientific progress is based. This study reviews and summarizes several recent publications, and adds new results demonstrating that the citation distribution of normal publications allows the probability of the infrequent events that support the progress of knowledge to be calculated.
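
The abstract's central idea is that the probability of rare, breakthrough-class papers can be estimated from the citation distribution of ordinary papers. As a minimal sketch, assuming a lognormal model of citation counts (a common model in the scientometric literature, though not necessarily the paper's exact procedure), the probability that a publication exceeds a high citation threshold follows from the fitted distribution, and multiplying that probability by publication volume gives an expected frequency. The function name, citation threshold, and synthetic data below are hypothetical illustrations, not values from the study.

```python
import math
import random

def lognormal_tail_probability(citations, threshold):
    """Fit a lognormal to nonzero citation counts by maximum likelihood
    (mean and std of log counts) and return P(citations > threshold)."""
    logs = [math.log(c) for c in citations if c > 0]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    # Lognormal survival function: P(C > t) = 0.5 * erfc((ln t - mu) / (sigma * sqrt(2)))
    return 0.5 * math.erfc((math.log(threshold) - mu) / (sigma * math.sqrt(2)))

# Synthetic citation counts standing in for a country's "normal" publications
# (hypothetical; real counts would come from a bibliographic database).
random.seed(1)
counts = [max(1, round(random.lognormvariate(2.0, 1.2))) for _ in range(5000)]

threshold = 500  # hypothetical citation level taken to mark a breakthrough-class paper
p = lognormal_tail_probability(counts, threshold)
print(f"P(paper exceeds {threshold} citations) = {p:.2e}")
print(f"Expected breakthrough-class papers among {len(counts)}: {p * len(counts):.2f}")
```

This illustrates why probabilities remain estimable even where the events themselves occur fewer than once per year: the tail probability is inferred from the thousands of ordinary papers, not from direct counts of the rare ones.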



Acknowledgements

This work was supported by the Spanish Ministerio de Economía y Competitividad, Grant Numbers FIS2014-52486-R and FIS2017-83709-R.

Author information

Corresponding author

Correspondence to Alonso Rodríguez-Navarro.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article

Cite this article

Rodríguez-Navarro, A., Brito, R. Probability and expected frequency of breakthroughs: basis and use of a robust method of research assessment. Scientometrics 119, 213–235 (2019). https://doi.org/10.1007/s11192-019-03022-1
