An analysis of in-text citations based on fractional counting

https://doi.org/10.1016/j.joi.2020.101070

Highlights

  • The majority of in-text citations were independent.

  • The majority of references that had no independent mentions were mentioned only once.

  • Approximately 20 % of the references did not independently contribute to the citing paper.

  • Most of the multiple mentioned references had high mention frequencies according to two counting methods.

Abstract

With the development of citation analysis, the analysis of in-text citations is becoming increasingly important. A paper's bibliography can contain many references, but each reference is mentioned differently within the full text. Some references are mentioned together with other references, and some are mentioned alone. That is, a citation sentence can include one reference or several. In either case, however, the citation sentence gives readers a single description. Thus, it is necessary to examine in-text citations by considering the way that each reference is mentioned within the full text. From this point of view, we introduce two counting methods (full counting and fractional counting) to examine in-text citations and compare the two counting methods. The number of in-text citations according to full counting was approximately 1.448 times the number according to fractional counting. The results show that the majority of in-text citations are independent, and the majority of references that have no independent mentions are mentioned only once. The results also show that most of the multiple mentioned references have high mention frequencies according to both full counting and fractional counting.

Introduction

Citation counts have long been used as an evaluation measure not only for individual researchers but also for research groups, universities, journals, and other entities. However, traditional citation counts have a limitation: they treat all references equally, which makes the accuracy of the citation count as a measure of scientific impact uncertain. Many researchers are therefore becoming more interested in the number of in-text citations, based on the idea that not all citations should be treated equally.

Each paper has a long list of references, but the number of in-text citations (mention frequency) of each reference is different. Some references are mentioned several times within the full text of a paper, and some references are mentioned only once. Therefore, the number of in-text citations of a reference can give readers some valuable information. Herlach (1978) found that the number of in-text citations can represent the relevance between papers. Voos and Dagaev (1976) indicated that the number of in-text citations of each reference was different and could be a good prediction feature. Hou, Li, and Niu (2011) reported that the number of in-text citations could represent the scientific contribution of each reference to the citing paper. Zhu, Turney, Lemire, and Vellino (2015) examined several features such as citation location, the number of in-text citations, and citation context, and reported that the number of in-text citations was a useful feature for representing the scientific contributions to a citing paper. Zhao, Cappello, and Johnston (2017) reported that the scientific contribution of each reference was likely to increase as the number of in-text citations increased. The above research results imply that the number of in-text citations is important for evaluating the scientific contribution of a reference to the citing paper. Waltman (2016) reviewed the citation impact indicators and recommended the use of the number of in-text citations when designing citation impact indicators.

Since the number of in-text citations can significantly represent the scientific contribution to the citing paper, many researchers have tried to use it for developing citation impact indicators. Hou et al. (2011) replaced the total number of citations with the total number of in-text citations and used it to evaluate the scientific impact of a scientific paper. Wan and Liu (2014) used the total number of in-text citations instead of the total number of citations to develop the WL-index, which evaluates the scientific impact of an individual researcher. Zhu et al. (2015) designed a new citation impact indicator that uses the weight of N², where N is the total number of in-text citations. However, replacing the total number of citations with the total number of in-text citations can often cause overweighting (Zhao & Strotmann, 2016; Zhao et al., 2017). Thus, several researchers have attempted to model this negative influence when using the number of in-text citations. Such approaches are performed in two ways: one is to remove all citations that are mentioned only once (Zhao & Strotmann, 2015; 2016) and the other is to remove some citations according to the citation locations (Hu, Chen, & Liu, 2015; Zhao et al., 2017). In many cases, the second mention of a multiple mentioned reference provides a supplementary explanation of its first-time mention within the full text (Hu, Lin, Sun, & Hou, 2017). Thus, removing some citations according to their citation locations may be better. However, it may not be true that all citations that are mentioned only once or are contained in a “background” section are perfunctory. This means that there is a need to examine in-text citations in detail.

Some references are mentioned alone within the full text of a paper, while others are mentioned together with other references. However, each reference has been allocated full credit in either case. Waltman (2016) reviewed fractional counting, in which the citation score of a publication is fractionally assigned to each author of the publication. Thus, the greater the number of authors of a publication, the smaller the citation score of each author. Similarly, it is natural to consider that the credit should be fractionally allocated to the references that are included in a citation sentence. This implies that it is necessary to consider whether each reference is mentioned alone or not. In fact, the purpose of a citation is to lend credit to the descriptions written by the authors. Even though a citation sentence includes several references, the citation sentence represents one description. In that sense, every citation sentence carries equal credit. From this point of view, the contributions of references that are mentioned alone may be regarded as larger than those of references that are mentioned together with other references.
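The contrast between the two allocations can be illustrated with a minimal sketch. It assumes that each citation sentence carries one unit of credit and that this unit is split equally among the references in the sentence; the equal split, the function name `count_mentions`, and the list-of-lists input format are illustrative assumptions, not details taken from this paper.

```python
from collections import defaultdict
from fractions import Fraction


def count_mentions(citation_sentences):
    """Return (full, fractional) mention counts per reference.

    citation_sentences is a hypothetical input format: a list in which
    each item is the list of reference keys cited in one citation sentence.
    """
    full = defaultdict(int)
    fractional = defaultdict(Fraction)  # Fraction() starts at 0
    for refs in citation_sentences:
        # Assumption: a reference repeated inside one sentence counts once.
        unique_refs = list(dict.fromkeys(refs))
        if not unique_refs:
            continue
        # Assumption: one unit of credit per sentence, split equally.
        share = Fraction(1, len(unique_refs))
        for ref in unique_refs:
            full[ref] += 1            # full counting: whole credit each time
            fractional[ref] += share  # fractional counting: equal share
    return dict(full), dict(fractional)


# Toy data: "A" is mentioned alone twice and once together with "B".
sentences = [["A"], ["A", "B"], ["B", "C", "D"], ["A"]]
full, fractional = count_mentions(sentences)
for ref in sorted(full):
    print(ref, full[ref], fractional[ref])
```

For this toy input, the full counts sum to the number of mentions (7), whereas the fractional counts sum to the number of citation sentences (4), which reflects the idea that every citation sentence carries equal credit.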

In this paper, we introduce the fractional counting method to examine in-text citations and consider whether the references are mentioned alone in a citation sentence or not. We collected 7083 papers from 9 journals, used two counting methods, full counting (Waltman & van Eck, 2015) and fractional counting, to calculate the number of in-text citations, and compared the two counting methods.

The rest of this paper is organized as follows. The next section covers some preliminary matters related to this research and presents the research questions. In Section 3, we describe the collected data and the methodology of this research. In Section 4, we examine the distribution of in-text citations and compare fractional counting with full counting. Finally, we discuss some problems related to the usage of the mention frequency and potential future work.

Section snippets

Definitions and research questions

For a better understanding of this paper, we define some terms. An in-text citation of a reference is a mention of the reference within the full text (Boyack, van Eck, Colavizza, & Waltman, 2018). In our paper, the term “mention” has the same meaning as the term “in-text citation”. A citation sentence is a sentence or phrase that includes in-text citations of references. The number of references in a citation sentence varies. That is, a citation

Data collection and methodology

For this study, we selected 9 journals and collected 7083 papers from those 9 journals. The collected papers were all in the form of an “article”. These 7083 papers cited 240,156 references in total. The collected data are shown in Table 1.

In this study, we first examined the distribution of the in-text citations of references by considering the mention weight. For each paper, we counted two kinds of mention frequencies for each reference by using two counting methods: the full counting method

Distribution of in-text citations

Under the full counting method, non-independent mentions were counted repeatedly, and so the numbers of in-text citations for the two counting methods were different. As Table 2 shows, the number of in-text citations found using the full counting method was, on average, approximately 1.448 times the number found using the fractional counting method.
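One way to read this ratio is sketched below. The derivation assumes that citation sentence s cites k_s references and that each of them receives the weight 1/k_s; the symbols S, k_s, N_full, and N_frac are introduced here for illustration only, and the equal-split weight is an assumption rather than a formula quoted from this paper.

```latex
% Assumption: sentence s cites k_s references, each weighted 1/k_s.
\begin{align*}
N_{\text{full}} &= \sum_{s=1}^{S} k_s, \\
N_{\text{frac}} &= \sum_{s=1}^{S} k_s \cdot \frac{1}{k_s} = S, \\
\frac{N_{\text{full}}}{N_{\text{frac}}} &= \frac{1}{S} \sum_{s=1}^{S} k_s = \bar{k}.
\end{align*}
```

Under that assumption, and if the reported figure is read as a ratio of totals, a value of about 1.448 would correspond to roughly 1.448 references per citation sentence on average, and the fractional total would be about 1/1.448 ≈ 0.69 of the full total.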

That is, the number of in-text citations was over-counted in full counting compared to

Discussion and conclusion

Several previous works reported that the mention frequency is one of the most effective features for representing contributions to a citing paper. Thus, there has been much research on the number of in-text citations (Hou, Li, & Niu, 2011; Wan & Liu, 2014; Zhao et al., 2017; Zhu, Turney, Lemire, & Vellino, 2015). In particular, in a previous work, we examined the distribution of references based only on full counting (Pak, Yu, & Wang, 2018). That is, we did not consider whether a

Author contribution

Chol Myong Pak: Conceived and designed the analysis; Collected the data; Contributed data or analysis tools; Performed the analysis; Wrote the paper

Weibin Wang: Collected the data; Contributed data or analysis tools; Performed the analysis; Wrote the paper

Guang Yu: Conceived and designed the analysis; Wrote the paper

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant Nos. 71704035 and 71774041). The authors would like to express sincere thanks to the editors and reviewers of this paper.
