
Convergence or polarisation? The impact of research assessment exercises in the Italian case

Scientometrics

Abstract

Two research assessments with an impact on university funding have taken place in Italy, covering the periods 2004–2010 and 2011–2014. After correcting grading schemes to ensure comparability across the two exercises, we show that university final scores exhibit some convergence. We find that convergence is largely due to changes in the relative productivity of researchers who participated in both exercises, as well as to hiring and promotions that occurred between the two exercises. Results are confirmed even when we equalise the number of products across the two exercises. When we consider departments within universities, we still find convergence, though the structure and composition of departments is not strictly comparable, because mapping researchers to departments involves some arbitrariness. These results suggest that convergence reflects genuine changes in the behaviour of researchers and in the strategies of assessed institutions, induced by the incentives created by the national research assessment exercises.


Notes

  1. The first trial exercise (VTR-Valutazione Triennale della Ricerca) was organized by an ad-hoc committee (CIVR-Comitato di Indirizzo per la Valutazione della Ricerca) and covered the period 2001–2003. Universities and research centres could submit up to half of their research staff and were free to choose the number of research products to be assessed. As a result, many universities submitted papers by their best researchers only, while others adopted the alternative strategy of involving all their researchers. All products (17,329, less than one fourth of the number of products evaluated in the two following exercises) were peer-reviewed (Cuccurullo 2006).

  2. A third assessment exercise (VQR3) was launched in 2019 to cover research activity published in 2015–2019. While evaluation results are expected in 2021 (or 2022), the methodology has been significantly modified with respect to the previous two experiences: all products will be peer-reviewed; the number of products becomes variable across researchers, allowing some researchers to compensate for the absence of others; products are to be weighted by the number of coauthors; and the final result will be the allocation of products into merit categories whose boundaries are not predefined. This makes these future scores not comparable with the scores obtained during VQR1 and VQR2, which are studied in the present paper.

  3. For an overview of the first exercise and of its results, see Ancaiani et al. (2015). The final reports of the first and the second exercises were published in 2013 and 2017 and can be downloaded from the ANVUR website (www.anvur.it).

  4. The Italian research assessment exercises have evaluated universities and public research entities, each group competing for the allocation of different sources of funds. Since research entities are more heterogeneous (they are specialised in different research fields and are unevenly distributed across the nation), we focus on the assessment of universities only.

  5. Using three institutions as case studies, the authors focus on the third aspect, arguing that there is a high degree of heterogeneity among institutions and researchers in the ability to select the “best” products, with a potential impact on the rankings. For STEM fields (the only ones where the automatic evaluation of products can be applied), the results indicate a loss of 23–32% of the maximum achievable score compared with an efficient selection. On the difficulty of fully understanding the complexity of the scoring system based on the VQR algorithm, see also Baccini (2016).
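Why selecting the “best” products is harder than it looks can be shown with a toy example (names and scores are invented for illustration): when a coauthored product may be submitted only once per institution, greedy per-researcher choices can be jointly suboptimal, which is one source of the inefficiency discussed above.

```python
from itertools import product as cartesian

# Invented scores: each researcher must submit one product, but a product
# coauthored within the same institution counts only once.
papers = {"P1": 1.0, "P2": 0.8, "P3": 0.4}
authors = {"Anna": ["P1", "P3"], "Bruno": ["P1", "P2"]}

def total(selection):
    # A product counts once even if selected by several authors.
    return sum(papers[p] for p in set(selection.values()))

# Greedy: each researcher picks her highest-scoring product in isolation.
greedy = {a: max(ps, key=papers.get) for a, ps in authors.items()}

# Exhaustive search over all joint selections finds the institutional optimum.
best = max(
    (dict(zip(authors, combo)) for combo in cartesian(*authors.values())),
    key=total,
)
# Both Anna and Bruno greedily pick P1, which counts once (total 1.0),
# while the joint optimum submits P1 and P2 (total 1.8).
```

This sketch only illustrates the combinatorial nature of the selection problem; the actual VQR scoring rules are richer.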

  6. In practice, the algorithm used by the Italian research assessment is more complicated, because of additional indicators based on PhD programmes and public engagement.

  7. The two VQRs dealt with a larger number of products (179,280 and 114,431 respectively) because public research centers were also assessed. However, since they are subject to different incentives and unevenly distributed across the country, we exclude them from our analysis.

  8. There is a further difference between the two exercises: while the first required submitting 3 products for each member of the faculty over a period of 7 years, the second exercise required 2 products over 4 years. It is not a priori clear whether this difference has any implications for our analysis. See the following paragraph on robustness checks.

  9. We omit the distribution of the non-harmonized second exercise scores because the impact of harmonization is negligible and the two curves almost coincide.

  10. Note that the total number of departments varies across universities and possibly across exercises.

  11. By mapping we mean associating a post-reform department with each researcher, both in VQR1 and VQR2 (note also that researchers might have changed university and/or department over the years). The easiest way to map old departments into new ones is to assign to researchers assessed in both exercises the unique affiliation used for the second VQR. However, this procedure is incomplete, since a new department affiliation was still missing for 3934 researchers at the time VQR2 was concluded (2769 in VQR1, 4.5% of the sample, and 1165 in VQR2, 2.2% of the sample). This is due to delays in the completion of the reform: some academics refused to choose a post-reform department and had to be forcefully assigned by rectors. For these cases we have proceeded as follows:

    • (a) in 3058 cases, we have analysed the flows of researchers within the same institution and departments from VQR1 to VQR2, and an academic has been automatically assigned to department \(d\) if more than half of her colleagues from VQR1 moved to department \(d\) in VQR2. In cases of ambiguity (216 cases) we have randomly assigned these researchers to one of the possible destinations in VQR2;

    • (b) for 876 cases where affiliation for VQR1 was absent, we retained the researchers in the analysis of VQR2 only, and dropped them for VQR1.
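The majority rule in step (a), with random assignment in case of ties, can be sketched as follows (a hypothetical implementation; the function name and data layout are our own, not the authors'):

```python
from collections import Counter
import random

def assign_department(colleagues_vqr1, vqr2_department, rng=random.Random(0)):
    """Assign a post-reform department to a researcher missing one (note 11).

    colleagues_vqr1: ids of the researcher's VQR1 department colleagues.
    vqr2_department: dict mapping a colleague id to her VQR2 department.
    """
    # Count where the old colleagues ended up in VQR2.
    destinations = Counter(
        vqr2_department[c] for c in colleagues_vqr1 if c in vqr2_department
    )
    if not destinations:
        return None  # case (b): no VQR2 information, drop from VQR1 analysis
    dept, votes = destinations.most_common(1)[0]
    if votes * 2 > sum(destinations.values()):
        return dept  # strict majority of colleagues moved to dept
    # Ambiguous case: pick at random among the most frequent destinations.
    top = max(destinations.values())
    return rng.choice(sorted(d for d, v in destinations.items() if v == top))
```

For example, a researcher whose three VQR1 colleagues split two-to-one between departments X and Y would be assigned to X; an even split triggers the random rule.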

  12. Equation (5) can be derived from the following first-order autoregressive process: \(s_{jt} = \alpha + \beta s_{jt - 1} + \varepsilon_{jt}\) (5′). If \(0 < \beta < 1\) the process exhibits mean reversion, i.e. it converges to a long-run equilibrium given by \(\bar{s} = \frac{\alpha }{(1 - \beta )}\). The coefficient \(\beta\) captures the degree of persistence, and therefore \(\left( {1 - \beta } \right)\) measures the “speed of convergence” to the long-run distribution (which in this simple framework degenerates, with all units converging to the same value). More formally, \(s_{jt} = \alpha + \beta s_{jt - 1} + \varepsilon_{jt}\) can be rewritten by repeated substitution as \(s_{jt} = \frac{\alpha }{1 - \beta } + \beta^{t} s_{j0} + \varepsilon_{jt} + \beta \varepsilon_{jt - 1} + \beta^{2} \varepsilon_{jt - 2} + \cdots\). If the \(\varepsilon_{jt}\) are iid, then \({\text{Var}}\left( {s_{j} } \right) = {\text{Var}}\left( {\varepsilon_{j} } \right)\left[ {1 + \beta^{2} + \beta^{4} + \cdots } \right] = \frac{{{\text{Var}}\left( {\varepsilon_{j} } \right)}}{{1 - \beta^{2} }}\). Thus as \(\beta \to 0\), \(s_{j} \to \alpha\) and \({\text{Var}}\left( {s_{j} } \right) \to {\text{Var}}\left( {\varepsilon_{j} } \right)\), its lowest value. At the other extreme, when \(\beta \to 1\) Eq. (5′) describes a random walk, for which stationary moments are not defined. Given the structure of our data (the cross-sectional dimension, 91, being much larger than the panel dimension, 2) we cannot formally test the non-stationarity of our variable. Nevertheless, we can ensure the stationarity of our dependent variable by resorting to the transformation depicted in Eq. (5).
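A minimal simulation of this AR(1) process (with invented parameter values) illustrates both the long-run equilibrium \(\bar{s} = \alpha/(1-\beta)\) and the recovery of \(\beta\) from a two-wave cross-section like VQR1/VQR2:

```python
import numpy as np

# Simulate s_jt = alpha + beta * s_j,t-1 + eps_jt for many units j;
# parameter values are invented for illustration.
rng = np.random.default_rng(0)
alpha, beta, n_units, n_periods = 2.0, 0.5, 500, 50

s = np.empty((n_periods, n_units))
s[0] = rng.normal(size=n_units)
for t in range(1, n_periods):
    s[t] = alpha + beta * s[t - 1] + rng.normal(scale=0.1, size=n_units)

# With 0 < beta < 1, each unit fluctuates around s_bar = alpha / (1 - beta).
s_bar = alpha / (1 - beta)

# A cross-sectional OLS of s_t on s_{t-1} over the last two waves
# (mimicking a two-wave panel) recovers beta; 1 - beta is the speed
# of convergence.
x, y = s[-2], s[-1]
beta_hat = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
```

With these values the simulated cross-section settles near \(\bar{s} = 4\) and the estimated slope is close to the true \(\beta = 0.5\).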

  13. Our estimate is lower than the one obtained by Buckle et al. (2020) (− 0.722) with a similar strategy; however, they consider a small group of universities, a selection of research fields and a longer time span.

  14. The result cannot be attributed to movements of researchers across institutions between the two exercises, as mobility required the opening of a position and a local competition, which were rare during the period of assessment due to the hiring freeze imposed by the central government for budgetary reasons.

  15. It should be pointed out that in Italy universities are subject to annual limits on the number of professors that they can hire or promote.

  16. In principle, any candidate was free to apply wherever she aimed to go. However, local competitions were often biased in favour of local candidates, and the selection committees were formed with this preferred outcome in mind. See Checchi et al. (2020).

  17. Italian academics are pigeon-holed into 371 research fields (settori scientifico-disciplinari), which are then grouped into 14 main research areas (aree CUN), used to aggregate the data shown in Fig. 7. Since VQR2 introduced the split of two areas (8 and 11), we have extended the comparison to these sub-areas.

References

  • Abramo, G., D’Angelo, C. A., & Rosati, F. (2016). The North-South divide in the Italian higher education system. Scientometrics, 109(3), 2093–2117. https://doi.org/10.1007/s11192-016-2141-9.

  • Abramo, G., & D’Angelo, C. A. (2015). The VQR, Italy’s second national research assessment: Methodological failures and ranking distortions. Journal of the Association for Information Science and Technology, 66(11), 2202–2214. https://doi.org/10.1002/asi.23323.

  • Abramo, G., D’Angelo, C. A., & Di Costa, F. (2014). Inefficiency in selecting products for submission to national research assessment exercises. Scientometrics, 98(3), 2069–2086. https://doi.org/10.1007/s11192-013-1177-3.

  • Ancaiani, A., Anfossi, A., Barbara, A., Benedetto, S., Blasi, B., Carletti, V., et al. (2015). Evaluating scientific research in Italy: The 2004–10 research evaluation exercise. Research Evaluation, 24(3), 242–255.

  • Baccini, A. (2016). Napoleon and the bibliometric evaluation of research: Considerations on university reform and the action of the national evaluation agency in Italy. Canadian Journal of Information and Library Science, 40(1), 37–57.

  • Baccini, A., & De Nicolao, G. (2016). Do they agree? Bibliometric evaluation versus informed peer review in the Italian research assessment exercise. Scientometrics, 108(3), 1651–1671. https://doi.org/10.1007/s11192-016-1929-y.

  • Barro, R. J. (1997). Determinants of economic growth: A cross-country empirical study. Cambridge, MA: MIT Press.

  • Bertocchi, G., Gambardella, A., Jappelli, T., Nappi, C. A., & Peracchi, F. (2015). Bibliometric evaluation vs. informed peer review: Evidence from Italy. Research Policy, 44(2), 451–466. https://doi.org/10.1016/j.respol.2014.08.004.

  • Buckle, R. A., Creedy, J., & Gemmell, N. (2020). Is external research assessment associated with convergence or divergence of research quality across universities and disciplines? Evidence from the PBRF process in New Zealand. Applied Economics. https://doi.org/10.1080/00036846.2020.1725235.

  • Checchi, D., Ciolfi, A., De Fraja, G., Mazzotta, I., & Verzillo, S. (2019a). Have you read this? An empirical comparison of the British REF peer review and the Italian VQR bibliometric algorithm. CEPR Discussion Paper 13521/2019.

  • Checchi, D., De Fraja, G., & Verzillo, S. (2020). Incentives and careers in academia: Theory and empirical analysis. The Review of Economics and Statistics (forthcoming). https://www.mitpressjournals.org/doi/abs/10.1162/rest_a_00916.

  • Checchi, D., Malgarini, M., & Sarlo, S. (2019b). Do performance-based research funding systems affect research production and impact? Higher Education Quarterly, 73, 45–69.

  • Cicero, T., Malgarini, M., Nappi, C. A., & Peracchi, F. (2013). Bibliometric and peer review methods for research evaluation: A methodological appraisement. MPRA (Munich Personal RePEc Archive), Munich (in Italian).

  • Cuccurullo, F. (2006). La Valutazione Triennale della Ricerca - VTR del CIVR: bilancio di un’esperienza. Analysis - Rivista di cultura e politica scientifica, 3–4, 5–7.

  • Grisorio, M. J., & Prota, F. (2020). Italy’s national research assessment: Some unpleasant effects. Studies in Higher Education, 45(4), 736–754. https://doi.org/10.1080/03075079.2019.1693989.

  • Viesti, G. (Ed.) (2016). Università in declino. Donzelli editore.


Author information


Corresponding author

Correspondence to Daniele Checchi.

Additional information

The opinions expressed in the paper are personal and do not involve the institutions of affiliation.


About this article


Cite this article

Checchi, D., Mazzotta, I., Momigliano, S. et al. Convergence or polarisation? The impact of research assessment exercises in the Italian case. Scientometrics 124, 1439–1455 (2020). https://doi.org/10.1007/s11192-020-03517-2
