Computer models for identifying instrumental citations in the biomedical literature

Fu, Lawrence D.; Aphinyanaphongs, Yindalon; Aliferis, Constantin F.

doi:10.1007/s11192-013-0983-y

Published: 27 February 2013

Volume 97, pages 871–882 (2013)

Scientometrics

Abstract

The most popular method for evaluating the quality of a scientific publication is citation count. This metric assumes that a citation is a positive indicator of the quality of the cited work. The assumption does not always hold, because citations serve many purposes. As a result, citation count is an indirect and imprecise measure of impact. If instrumental citations could be reliably distinguished from non-instrumental ones, existing citation-based metrics could be improved by excluding the non-instrumental citations. A citation was operationally defined as instrumental if either of the following was true: the hypothesis of the citing work was motivated by the cited work, or the citing work could not have been executed without the cited work. This work investigated the feasibility of developing computer models that automatically classify citations as instrumental or non-instrumental. Instrumental citations were manually labeled, and machine learning models were trained on a combination of content and bibliometric features. The experimental results indicate that these models can automatically classify instrumental citations with high predictivity (AUC = 0.86). Additional experiments using independent held-out data and prospective validation show that the models are generalizable and can handle unseen cases. This work demonstrates that it is feasible to train computer models to automatically identify instrumental citations.
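
As an illustration of the two quantities the abstract refers to, the following minimal Python sketch computes the area under the ROC curve (AUC) for a set of scored citations and a citation count restricted to citations predicted to be instrumental. It is not the authors' implementation. The function names, the 0.5 decision threshold, and the example labels and scores are all assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed pairwise.

    Equals the probability that a randomly chosen instrumental citation
    (label 1) receives a higher score than a randomly chosen
    non-instrumental one (label 0), with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one citation of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def instrumental_count(scores: Sequence[float], threshold: float = 0.5) -> int:
    """Citation count that keeps only citations predicted instrumental.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    return sum(1 for s in scores if s >= threshold)


# Hypothetical data: six citations to one paper, with manual labels
# (1 = instrumental) and scores from some trained classifier.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]

print("AUC:", auc(labels, scores))
print("raw citation count:", len(scores))
print("instrumental-only count:", instrumental_count(scores))
```

The pairwise form of AUC is quadratic in the number of citations, which is fine for small labeled sets; a rank-based computation would be the usual choice for larger ones.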

Acknowledgments

The authors gratefully acknowledge support from R56 LM007948-04A1 and 1UL1RR029893.

Author information

Authors and Affiliations

Department of Medicine, Center for Health Informatics and Bioinformatics, New York University Medical Center, 227 E 30th Street, 7th Floor, New York, NY, 10016, USA
Lawrence D. Fu & Yindalon Aphinyanaphongs
Department of Pathology, Center for Health Informatics and Bioinformatics, New York University Medical Center, 227 E 30th Street, 7th Floor, New York, NY, 10016, USA
Constantin F. Aliferis

Corresponding author

Correspondence to Lawrence D. Fu.

About this article

Cite this article

Fu, L.D., Aphinyanaphongs, Y. & Aliferis, C.F. Computer models for identifying instrumental citations in the biomedical literature. Scientometrics 97, 871–882 (2013). https://doi.org/10.1007/s11192-013-0983-y

Received: 28 November 2012
Published: 27 February 2013
Issue Date: December 2013
DOI: https://doi.org/10.1007/s11192-013-0983-y
