Abstract
News mentions are considered a useful source for measuring the societal impact of scholarly output, and data quality plays a fundamental role in both research on and applications of such data. This study aims to measure the accuracy of news mention data in the Altmetric database, in order to inform the reliability and limitations of news altmetrics studies. In total, 5.83 million news mention records involving 1.03 million scholarly outputs were extracted from the whole dataset up to December 2019 provided by the Altmetric database. From these, 3000 records were selected by stratified sampling for content analysis. Results show that: (1) 6 major error types and 14 specific error types are identified. (2) Errors occur in 42.5% of the sample records: 27.1% are attributable to the news platform and 15.4% to the Altmetric database. (3) Inaccessibility of the source news article (25.9%), an incorrect news link provided by the Altmetric database (6.9%) and an inaccurate news mention (7.9%) are the three most common error types. (4) 8.5% of the sample records have errors that would cause miscalculation and undermine the validity of studies based on the data, while 33.8% of the sample records have errors that would affect reliability and reproducibility. (5) The underlying reasons for the errors are summarized, and possible measures to improve data quality are discussed systematically. These results suggest that although the Altmetric database has made great achievements in collecting news altmetrics data, its data collection can be further improved.
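To make the reported shares concrete, the following minimal sketch converts the abstract's percentages into approximate record counts for the 3000-record sample. The variable and function names are illustrative assumptions, and the counts are back-calculated from rounded percentages, so they may differ slightly from the exact tallies in the study.

```python
# Approximate record counts implied by the percentages in the abstract.
# All names here are illustrative; counts are derived from rounded
# percentages and are therefore approximate.
SAMPLE_SIZE = 3000  # sampled records, as stated in the abstract

# Shares of the sample reported in the abstract (percent).
shares = {
    "any_error": 42.5,
    "news_platform": 27.1,
    "altmetric_database": 15.4,
    "validity_affecting": 8.5,
    "reliability_affecting": 33.8,
}


def implied_count(percent, n=SAMPLE_SIZE):
    """Convert a reported percentage into an approximate record count."""
    return round(percent * n / 100)


counts = {name: implied_count(p) for name, p in shares.items()}

# Split of erroneous records by responsible party.
by_source = counts["news_platform"] + counts["altmetric_database"]
# Split of erroneous records by consequence for downstream studies.
by_consequence = counts["validity_affecting"] + counts["reliability_affecting"]

print(counts)
print("by source:", by_source, "of", counts["any_error"])
print("by consequence:", by_consequence, "of", counts["any_error"])
```

Under these assumptions the split by source adds up to the overall error share, while the split by consequence sums to 42.3% rather than 42.5%. The abstract does not explain the difference; it may reflect rounding or records not assigned to either consequence category.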
Acknowledgements
The research is supported by the Humanity and Social Science Foundation of Ministry of Education of China (18YJC870023) and the National Natural Science Foundation of China (No. 71804067). The authors would like to thank Altmetric.com for providing access to the data.
About this article
Cite this article
Yu, H., Yu, X. & Cao, X. How accurate are news mentions of scholarly output? A content analysis. Scientometrics 127, 4075–4096 (2022). https://doi.org/10.1007/s11192-022-04382-x
DOI: https://doi.org/10.1007/s11192-022-04382-x