Journal of Informetrics

Volume 8, Issue 4, October 2014, Pages 972-984

Distributions of citations of papers of individual authors publishing in different scientific disciplines: Application of Langmuir-type function

https://doi.org/10.1016/j.joi.2014.09.009

Highlights

  • Citation distributions of papers of authors working in different fields are studied.

  • Langmuir-type (LT) function describes citation distributions satisfactorily.

  • Parameters K, Nc and α of LT function are mutually related.

  • Quantity KNc = (KNc)0/(α − 1), where the proportionality constant (KNc)0 < 1.

  • Value of constant (KNc)0 is related to the nature of sources of items (citations).

Abstract

The distribution of cumulative citations L and contributed citations Lf to individual multiauthored papers published by selected authors working in different scientific disciplines is analyzed and discussed using the Langmuir-type function yn = y0[1 − αKn/(1 + Kn)], where yn denotes the total number of normalized cumulative citations ln* and normalized contributed citations lnf* received by individual papers of rank n, y0 is the maximum value of yn when n = 0, α ≥ 1 is an effectiveness parameter, and K is the Langmuir constant related to the dimensionless differential energy Q = ln(KNc), with Nc as the number of papers receiving citations. Relationships between the Langmuir constant K of the distribution function, the number Nc of papers of an individual author receiving citations and the effectiveness parameter α of this function, obtained from analysis of the rank-size distribution data of the authors, are investigated. It was found that: (1) the quantity KNc obtained from the real citation distribution of papers of various authors working in different disciplines is inversely proportional to (α − 1) with a proportionality constant (KNc)0 < 1, (2) the relation KNc = (KNc)0/(α − 1) also holds for the citation distribution of journals published in countries of two different groups, investigated earlier (Sangwal, K. (2013). Journal of Informetrics, 7, 487–504), and (3) deviations of the real citation distribution from curves predicted by the Langmuir-type function are associated with changing activity of the sources generating items (citations).
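As a minimal sketch of the function quoted in the abstract, the Python snippet below evaluates yn = y0[1 − αKn/(1 + Kn)] and the rank at which the curve reaches zero. Setting yn = 0 gives Kn = 1/(α − 1), which corresponds to the limiting case (KNc)0 = 1 of the reported relation. The function names and the parameter values are illustrative assumptions and are not taken from the paper.

```python
def langmuir_type(n, y0, alpha, K):
    """Langmuir-type rank-size function y_n = y0 * [1 - alpha*K*n / (1 + K*n)]."""
    return y0 * (1.0 - alpha * K * n / (1.0 + K * n))


def zero_rank(alpha, K):
    """Rank n at which y_n = 0, obtained from K*n = 1/(alpha - 1)."""
    if alpha <= 1.0:
        raise ValueError("the curve only reaches zero for alpha > 1")
    return 1.0 / (K * (alpha - 1.0))


# Hypothetical parameter values, chosen only for illustration.
y0, alpha, K = 1.0, 1.25, 0.05

n0 = zero_rank(alpha, K)  # rank where the model curve crosses zero
curve = [langmuir_type(n, y0, alpha, K) for n in range(0, 81, 20)]
```

Because the paper reports (KNc)0 < 1, a fitted curve would be expected to remain above zero at the last cited paper, n = Nc, rather than cross zero exactly there.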

Introduction

The distribution of the number of items such as citations, papers, authors, journals and journal impact factors by their rank and size is an important research area in informetrics (Bornmann and Daniel, 2009, Egghe and Waltman, 2011, Egghe, 2009, Egghe, 2011, Egghe, 2013, Guerrero-Bote et al., 2007, Kretschmer and Rousseau, 2001, Lancho-Barrantes et al., 2010, Leherrere and Sornette, 1998, Perc, 2010, Radicchi et al., 2008, Redner, 1998, Redner, 2005, Tsallis and de Albuquerque, 2000, Vieira and Gomes, 2010, Wallace et al., 2009). Various laws (e.g. Lotka's and Zipf's laws) and functions have been proposed in the literature to describe these informetric distributions and to explain the mechanism underlying their occurrence. The approaches used to investigate rank and size distributions of citations may be classified into the following categories:

  • (1) Theoretical studies of modeling of citation behavior carried out using preselected mathematical functions to generate citations (Burrell, 2001, Burrell, 2002, Burrell, 2013, Egghe, 2009, Egghe, 2013, Kretschmer and Rousseau, 2001, Nadarajah and Kotz, 2007). However, the main limitation of these functions is that they contain adjustable empirical parameters.

  • (2) Empirical studies devoted to the analysis of a dataset of citation distributions, constructed over a selected time window or a long period of time for a single discipline, speciality or journal, carried out using known mathematical functions (Bornmann and Daniel, 2009, Clauset et al., 2009, Companario, 2010, Perc, 2010, Radicchi et al., 2008, Redner, 1998, Redner, 2005, Vieira and Gomes, 2010, Wallace et al., 2009). The main limitation of these functions is that they result in poor fitting at very high or very low rank (Mansilla et al., 2007, Naumis and Cocho, 2007, Sangwal, 2013a).

  • (3) Phenomenological approach used to describe citation data in terms of theoretical equations based on specific microscopic models (Barabasi and Albert, 1999, Guerrero-Bote et al., 2007, Gupta et al., 2008, Mansilla et al., 2007, Naumis and Cocho, 2007, Price, 1965, Price, 1976, Sangwal, 2013a, Sangwal, 2013b, Simkin and Roychowdhury, 2007, Tsallis and de Albuquerque, 2000, Vieira and Gomes, 2010, Wallace et al., 2009). The advantage of this approach is that the mathematical functions contain parameters attributed to some physical processes.

Sangwal (2013a) analyzed citation distributions of papers of different selected authors using five mathematical functions. The main conclusion of that study is that the Zipf-type power law and the logarithmic function previously proposed by Guerrero-Bote et al. (2007) for their iceberg hypothesis are inadequate to describe the citation distribution of individual papers of the authors, and that the new stretched exponential, Langmuir-type and empirical binomial functions can be employed to analyze citation distributions. In a later paper, Sangwal (2013b) analyzed distributions of citations L, two-year (IF2) and five-year (IF5) impact factors, and citation half-lives λ of journals published in different selected countries using the Langmuir-type relation. It was found that the general features of the rank-order distributions of L, IF2 or IF5 of the journals published in individual countries are similar to those of the citation distribution of papers of individual authors, and that, for a given process (citations, or two- and five-year impact factors), the product of the Langmuir constant K and the number N of journals published in different countries is constant.

Size- and rank-order distributions may be defined in terms of information production processes (Egghe & Rousseau, 1990). An information production process consists of sources which produce or have items. Typical examples of information production processes are a country in which N published journals receive L citations, a journal or a discipline in which N published papers receive L citations, and an author whose N published papers have received L citations since their publication. An information production process may be considered as a system or set of sources generating items. Sangwal (2013a, 2013b) derived his equations of the rank-order distribution of items generated by individual sources of the same activity using the concepts of adsorption processes involved during crystal growth.
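The source/item picture above can be sketched in a few lines of Python, treating the papers of one author as sources and their citations as items. The citation counts are hypothetical, and the convention that rank 1 denotes the most cited paper is an assumption made here for illustration.

```python
def rank_order(citations):
    """Return (rank, citations) pairs, with rank 1 assigned to the most cited item."""
    ordered = sorted(citations, reverse=True)
    return list(enumerate(ordered, start=1))


# Hypothetical citation counts for the N = 8 papers of one author.
papers = [12, 0, 45, 3, 7, 0, 21, 3]

ranked = rank_order(papers)
N = len(papers)                        # number of sources (papers)
L = sum(papers)                        # total number of items (citations)
Nc = sum(1 for c in papers if c > 0)   # papers receiving at least one citation
```

Only the Nc papers with at least one citation contribute to the rank-order curve that a distribution function is later fitted to.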

It is well known that the average number of citations per paper differs among scientific disciplines because of their different citation behavior (for example, see: Abramo et al., 2012, Alonso et al., 2009, Hirsch, 2005, Iglesias and Pecharroman, 2007, Lundberg, 2007, Podlubny, 2005, Radicchi et al., 2008). Therefore, neither the total number of citations nor any index based on the citation distribution, such as the Hirsch index h, can be used to compare the research performance of researchers working in different scientific disciplines. Using the total number of citations (Iglesias and Pecharroman, 2007, Lundberg, 2007) or distributions of citations to the papers published in different fields (Abramo et al., 2012, Alonso et al., 2009, Podlubny, 2005, Radicchi et al., 2008), different scaling parameters have been proposed to compare the research output of researchers in various fields. Several studies have reported the average number of citations in different fields as an effective scaling parameter (Abramo et al., 2012, Alonso et al., 2009, Bornmann and Daniel, 2009, Hirsch, 2005, Iglesias and Pecharroman, 2007, Podlubny, 2005, Radicchi et al., 2008), but the median or geometric mean of citations has also been proposed (Lundberg, 2007).
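A rough illustration of scaling by the field average, in the spirit of the relative indicator cf = c/c0 named in the reference list, is sketched below in Python. The field samples, the citation counts and the helper name are hypothetical, and this is not presented as the procedure of any of the cited studies.

```python
from statistics import mean


def relative_citations(citations, field_average):
    """Scale raw citation counts c by the field average c0, giving c / c0."""
    if field_average <= 0:
        raise ValueError("field_average must be positive")
    return [c / field_average for c in citations]


# Hypothetical citation counts per paper, used only to estimate field averages.
field_samples = {"mathematics": [2, 4, 6], "biology": [10, 20, 30]}
field_avg = {field: mean(counts) for field, counts in field_samples.items()}

# A paper with 8 citations in mathematics and one with 40 in biology
# end up with the same relative score once each is divided by its field average.
math_scaled = relative_citations([8], field_avg["mathematics"])
bio_scaled = relative_citations([40], field_avg["biology"])
```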

All citation-related measures for the publication output of an author assume that the papers receiving citations were written by that author alone. However, since in most disciplines many of an author's papers are multiauthored, it is illogical to award full credit to each author. In reality, these measures penalize authors who publish alone in comparison with those publishing with a large number of coauthors, whether juniors or seniors. Because the contributions of the different authors of a multiauthored paper are frequently unknown, devising a fair method of counting the contributions of individual authors has drawn considerable attention for over three decades (for example, see: Assimakis and Adam, 2010, Batista et al., 2006, Hodge and Greenberg, 1981, Pereira de Araújo, 2008, Price, 1981, Tol, 2011, Vinkler, 1993).

A survey of the published literature shows that analysis of the distribution of citations received by multiauthored papers of individual authors using physical models has so far drawn relatively little attention. Moreover, no study has yet investigated the distribution of citations contributed by individual authors to their multiauthored papers, or compared the distributions of cumulative citations and contributed citations to each multiauthored paper published by individual authors. The present paper addresses this topic. The aim of the paper is two-fold: (1) to analyze the distribution of cumulative citations L and contributed citations Lf to each multiauthored paper published by individual authors working in different scientific disciplines using the newly proposed Langmuir-type function, and (2) to investigate the relationship between the Langmuir constant K of the distribution function, the number Nc of papers of an individual author receiving citations and the effectiveness parameter α of this function.
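One way to estimate α and K from rank-order data, sketched here under stated assumptions, uses the standard Langmuir linearization. Writing u = 1 − yn/y0 = αKn/(1 + Kn) gives n/u = 1/(αK) + n/α, a straight line in n with slope 1/α and intercept 1/(αK). This is not claimed to be the fitting procedure used in the paper. The snippet assumes that y0 is known, uses noise-free synthetic data, and all names and values are hypothetical.

```python
def langmuir_type(n, y0, alpha, K):
    """Langmuir-type function y_n = y0 * [1 - alpha*K*n / (1 + K*n)]."""
    return y0 * (1.0 - alpha * K * n / (1.0 + K * n))


def fit_linearized(ranks, values, y0):
    """Estimate (alpha, K) by least squares on n/u = 1/(alpha*K) + n/alpha."""
    xs, zs = [], []
    for n, y in zip(ranks, values):
        u = 1.0 - y / y0
        if n > 0 and u > 0:  # the ratio n/u is undefined for n = 0 or u = 0
            xs.append(float(n))
            zs.append(n / u)
    if len(xs) < 2:
        raise ValueError("need at least two usable points")
    x_mean = sum(xs) / len(xs)
    z_mean = sum(zs) / len(zs)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxz = sum((x - x_mean) * (z - z_mean) for x, z in zip(xs, zs))
    slope = sxz / sxx                     # equals 1/alpha
    intercept = z_mean - slope * x_mean   # equals 1/(alpha*K)
    return 1.0 / slope, slope / intercept


# Synthetic, noise-free data for Nc = 40 cited papers (hypothetical parameters).
y0, alpha_true, K_true, Nc = 1.0, 1.2, 0.1, 40
ranks = list(range(1, Nc + 1))
values = [langmuir_type(n, y0, alpha_true, K_true) for n in ranks]

alpha_fit, K_fit = fit_linearized(ranks, values, y0)
KNc0 = K_fit * Nc * (alpha_fit - 1.0)  # constant in KNc = (KNc)0 / (alpha - 1)
```

With real citation data the linearized form weights points unevenly, so a nonlinear least-squares fit of the original function may be preferable. The sketch is meant only to show how α, K and the product KNc relate to one another.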

Section snippets

Basic concepts of Langmuir-type function

Sangwal (2013a, 2013b) derived the Langmuir-type function of the rank-order distribution of items from the concepts of adsorption processes involved during crystal growth. The basic concepts used in its derivation are briefly described below.

The derivation of the Langmuir-type function is based on the following postulates:

  • (1) There are N0 sources which have the same number smax of possible active sites.

  • (2) The maximum number of

Bibliometric data and their analysis

For the analysis we used the bibliometric data up to 2012 for selected authors who were nominated professors in seven scientific disciplines (namely: chemistry, physics, mathematics, biology, medical sciences, technical sciences and agricultural science) by the President of Poland during May and October 2013. The Thomson Reuters’ Web of Knowledge database was employed to collect the basic data. In each scientific discipline 5–6 authors were selected from the lists of nominated professors

Citation distribution of professors of different disciplines

Typical examples of the rank-order distribution of the cumulative citations l* and contributed citations lf* of individual papers for the various professors nominated in different disciplines are presented in Figs. 2–4, which show the dependence of citations l* and lf* on rank n of the papers for 7 professors nominated in physics, 6 professors nominated in technical sciences and 5 professors nominated in agricultural sciences. It may be seen that the lf*(n) data

Relationship between Langmuir constant K, number Nc of cited papers and effectiveness parameter α

Sangwal (2013b) discussed earlier the relationship between the Langmuir constant K obtained from the distributions of citations ln of journals published in different countries as a function of their rank n and their total number N. Since every nth journal receives citations ln here, N = Nc. It was found that the Langmuir constant K decreases linearly with increasing number N of journals published in a country, following the relation: ln(KN) ≈ 2.5 and 1.0 for citations and for two- and five-year

Nonuniformity in citation distributions

The Langmuir-type function (4) is based on the postulate that all sources producing items have the same activity, which results in a smooth dependence of citations L received by Nc papers as a function of their rank (see Fig. 1 and the curves in Figs. 2–4). However, it is observed that in some cases there are significant deviations between the best-fit plot according to the Langmuir-type function and the bibliometric citation data. Two examples are the citation data for AMa and RLe

Summary and conclusions

The distribution of cumulative citations L and contributed citations Lf to individual multiauthored papers published by authors working in different scientific disciplines is analyzed using the Langmuir-type function (3), which is based on the concept of information production processes involving similar activity of sources producing items (e.g. citations received by papers acting as sources). It is shown that the Langmuir-type function satisfactorily describes the dependence of the total number of

Acknowledgement

The author expresses his gratitude to Dr. Kazimierz Wójcik for his assistance with the collection of the data used in this work.

References (45)

  • M.L. Wallace et al., Modeling a century of citation distributions, Journal of Informetrics (2009)

  • G. Abramo et al., Revisiting the scaling of citations for research assessment, Journal of Informetrics (2012)

  • S. Alonso et al., h-Index: A review focused in its variants, computation and standardization for different scientific fields, Journal of Informetrics (2009)

  • N. Assimakis et al., A new author's productivity index: p-index, Scientometrics (2010)

  • A.L. Barabasi et al., Emergence of scaling in random networks, Science (1999)

  • P.D. Batista et al., Is it possible to compare researchers with different scientific interests?, Scientometrics (2006)

  • L. Bornmann et al., Universality of citation distribution – A validation of Radicchi et al.’s relative indicator cf = c/c0 at the micro level using data from chemistry, Journal of the American Society for Information Science and Technology (2009)

  • Q.L. Burrell, Stochastic modeling of the first-citation distribution, Scientometrics (2001)

  • Q.L. Burrell, The nth-citation distribution and obsolescence, Scientometrics (2002)

  • A. Clauset et al., Power-law distributions in empirical data, SIAM Review (2009)

  • J.M. Companario, Distribution of ranks of articles and citations in journals, Journal of the American Society for Information Science and Technology (2010)

  • L. Egghe, A rationale for the Hirsch-index rank-order distribution and a comparison with the impact factor rank-order distribution, Journal of the American Society for Information Science and Technology (2009)