Lotka's distribution and distribution of co-author pairs’ frequencies

https://doi.org/10.1016/j.joi.2007.07.003

Abstract

The original Lotka's Law refers to the distribution of single scientists, i.e. the frequency of authors Ai with i publications per author is a function of i: Ai = f(i). However, with increasing collaboration in science and technology, the study of the frequency of pairs or triples of co-authors is highly relevant. Starting with the pair distribution, well-ordered collaboration structures of co-author pairs are presented, i.e. the frequency of co-author pairs Nij between authors with i publications per author and authors with j publications per author is a function of i and j: Nij = f(i, j), using the normal count procedure for counting i or j. We assume that the distribution of co-author pairs’ frequencies can be considered a reflection of a social Gestalt and can therefore be described by the corresponding mathematical function, which is based on well-known general characteristics of structures in interpersonal relations in social networks. We show that this model of social Gestalts explains the distribution of co-author pairs better than a simple bivariate function in analogy to Lotka's Law. The model is based on both Gestalt theory and the old Chinese Yin/Yang theory.

Introduction

For several decades, collaboration has been increasing in science and technology. The usual bibliometric method for studying collaboration is the investigation of co-authorships (de B. Beaver, 2001; de Solla Price, 1963; Glänzel, 2002; Glänzel & de Lange, 1997; Glänzel & Schubert, 2004; Luukkonen, Persson, & Sivertsen, 1992; Miquel & Okubo, 1994; Newman, 2001; Okubo, Miquel, Frigoletto, & Doré, 1992; Tijssen & Moed, 1989; Zitt, Bassecoulard, & Okubo, 2000).

The original Lotka's Law (Lotka, 1926) refers to the distribution of single scientists, i.e. the frequency of authors Ai with i publications per author is a function of i:

Ai = f(i)

However, with increasing collaboration in science and technology, the study of the frequency of pairs or triples of co-authors is highly relevant. Starting with the pair distribution, the frequency of co-author pairs Nij between authors with i publications per author and authors with j publications per author is a function of i and j:

Nij = f(i, j)

Whereas Lotka's Law concerns the distribution of single scientists P (in both single-authored and multi-authored bibliographies), future work should also consider the distribution of pairs P, Q, of triples P, Q, R, etc.

Starting with pair distribution, the following questions arise in the present paper:

  • Is there any regularity for the distribution of co-author pairs’ frequencies?

  • If yes, can the distribution of co-author pairs be described by an extension of Lotka's Law or can this distribution be better described by a model of social Gestalts?

Regarding the second question, two theoretical distributions will be calculated; they are specified and discussed in the following sections.

In Section 1, however, we only intend to visualize the two theoretical distributions in comparison with an empirical distribution.

For visualization, Lotka's distribution and the co-authorship distribution of pairs of collaborators obtained from the journal Science are presented in Fig. 1. The articles published in Science from 1980 to 1998 were studied, comprising 47,117 authors and a total sum of Nij = 418,458 co-author pairs (for the method of counting Nij, see below).

Fig. 1 shows the distribution of the number of authors Ai with i publications per author (upper row) contrasted with the distribution of the number of co-author pairs Nij between authors with i publications per author and authors with j publications per author (lower row). For clarity and optimum visualization, the presentation of data is restricted to authors with at most 10 articles, i.e. i = 1, 2, …, 10 and j = 1, 2, …, 10.

The upper row reflects Lotka's Law. In this row, the distribution of the number of authors Ai with i publications per author is given on the left, and the corresponding double logarithmic presentation on the right. Ai (or log Ai respectively) is plotted on the Y-axis and i (or log i respectively) on the X-axis.

In the lower row, the distribution of the number of co-author pairs Nij is given on the left, and the corresponding triple logarithmic presentation on the right. The number of pairs Nij (or log Nij respectively) is plotted on the Z-axis, i (or log i respectively) on the X-axis and j (or log j respectively) on the Y-axis.

Fig. 1 shows that the answer to the first question is yes: the distribution of co-author pairs’ frequencies exhibits a regularity.

However, before considering the second question, regarding the description of the distribution of co-author pairs’ frequencies [Nij = f(i, j)], information about the method of counting these co-author pairs is necessary.

Method for counting Ni and Nij (in extended form, cf. Appendix A): Consider an artificial bibliography of eight papers (names of authors: A, B, …).

The number of publications i (or j respectively) per author P (or Q respectively) is determined by resorting to the “normal count procedure”. Each time the name of an author appears, it is counted (e.g. A three times: once in the first paper, and once each in the 4th and 8th papers).

Pairs P, Q are marked in the cells of a matrix according to both the publication count i of the first author P and the publication count j of the second author Q, i.e. the authors are ordered according to i or j respectively in both the row and the column (cf. Table 1).

Provided that the position of the authors in the by-line is not taken into consideration, the resulting matrix is symmetrical. For example, the pair G, A is marked twice: once with G counted under i and A counted under j, and once with A counted under i and G counted under j.

In the symmetrical matrix, one can determine for each author P the number of collaborators NP. NP is equal to the Degree Centrality in Social Network Analysis (SNA).

The matrix of Nij (Table 2, derived from the symmetrical matrix) gives the number of pairs Nij in the bibliography that join authors with i publications per author and authors with j publications per author.

For example, the pairs E, D and F, D in Table 1 are counted both as N12 = 2 and N21 = 2 in the matrix of Nij.
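The counting method described above can be sketched in code. The following is a minimal illustration, not the authors' program: the five-paper bibliography is hypothetical (it is not the bibliography of Table 1), the function name count_pairs is introduced here for illustration only, and it is assumed that a pair is marked once per order even if the two authors share several papers.

```python
from collections import Counter
from itertools import permutations


def count_pairs(bibliography):
    """Return (publications per author, collaborators per author, Nij)."""
    # Normal count: every appearance of a name in a by-line counts once.
    pubs = Counter(author for paper in bibliography for author in paper)
    # Symmetrical marking: each co-author pair is recorded in both orders.
    # Assumption: repeated joint papers do not add further marks.
    marked = {
        pair
        for paper in bibliography
        for pair in permutations(set(paper), 2)
    }
    # NP: number of distinct collaborators of author P (degree centrality).
    collaborators = Counter(p for p, _ in marked)
    # Nij: number of marked pairs joining an author with i publications
    # and an author with j publications.
    n_ij = Counter((pubs[p], pubs[q]) for p, q in marked)
    return pubs, collaborators, n_ij


# Hypothetical bibliography; each inner list is the by-line of one paper.
papers = [["A", "G"], ["D", "E"], ["D", "F"], ["A"], ["A", "G"]]
pubs, collaborators, n_ij = count_pairs(papers)

print(dict(pubs))
print(dict(collaborators))
print(dict(n_ij))
```

In this hypothetical bibliography, E and F each have one publication and D has two, so the pairs E, D and F, D contribute to both N12 and N21, in the same way as in the example given in the text.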

As mentioned above, in Section 1 we only intend to visualize both the theoretical pattern derived from the extension of Lotka's Law and the theoretical pattern of social Gestalt in comparison with the empirical distribution obtained from the journal Science.

Fig. 2 shows the comparison of the empirical distribution of log Nij with the two different distributions of theoretical values. The figures on the right are rotated by 90°.

The empirical distribution of the pairs’ frequencies in Science (Fig. 2, second row) closely resembles the theoretical distribution of social Gestalts (third row) but differs from the pattern obtained by the extension of Lotka's Law (first row).

This is the first evidence for the assumption that the distribution of the co-author pairs’ frequencies Nij can be considered a social Gestalt. This statement is the key result of the paper; it underlies the principal method and constitutes the main point.

Both the extension of Lotka's Law and the meaning of Gestalt will be explained in the next sections, followed by the mathematical function for the description of social Gestalts. This general theoretical function is valid for different kinds of social Gestalts. We will show that the distribution of co-author pairs’ frequencies is one example of them, using data from four journals.

For a thorough explanation of the theoretical and methodological background of the studies in this paper, most sections and paragraphs refer to the corresponding information in the appendices.


Theoretical bivariate distribution derived from the extension of Lotka's Law

Qin (1995) showed in her example that the number of collaborators Ni is distributed in the same way as the total number of publications of all authors with i publications per author (Ti):

Ti = i · Ai

This means that the marginal sums Ni (or Nj respectively) should be distributed according to an inverse power function in line with Lotka's Law, however with a different parameter:

Ni = constant / i^a or Nj = constant / j^a, respectively
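Why the parameter differs can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the authors' procedure: the author counts are synthetic values that follow an inverse power function exactly, the exponent 2 and the constant 1000 are chosen only for the example, and fit_inverse_power is a hypothetical helper that fits a straight line to the double logarithmic data by ordinary least squares.

```python
import math


def fit_inverse_power(values):
    """Fit y = c / i**a by least squares on log y = log c - a * log i.

    `values` maps i to y; returns the pair (c, a).
    """
    xs = [math.log(i) for i in values]
    ys = [math.log(y) for y in values.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return math.exp(mean_y - slope * mean_x), -slope


# Synthetic author counts (assumption): Ai = 1000 / i**2 for i = 1..10.
A = {i: 1000 / i**2 for i in range(1, 11)}
# Total publications of all authors with i publications: Ti = i * Ai.
T = {i: i * A[i] for i in A}

c_A, a_A = fit_inverse_power(A)
c_T, a_T = fit_inverse_power(T)
print(f"Ai: constant = {c_A:.1f}, exponent = {a_A:.3f}")
print(f"Ti: constant = {c_T:.1f}, exponent = {a_T:.3f}")
```

Because Ti = i · Ai, an inverse power function for Ai with exponent a gives an inverse power function for Ti with exponent a − 1, so the fitted exponent for Ti is smaller by one while the form of the function is unchanged.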

Because of the symmetry of the matrix, both the distributions of the marginal

General remarks

In the wake of a tangible change of paradigm in science occurring by the end of the 20th century, a number of holistic theories have emerged (e.g. Bohm, 1980; Laszlo, 1997; Prigogine & Stengers, 1984; Sheldrake, 1988; Stapp, 1993, to mention only a few), which operate on the idea of holographic interacting entities in the world; several of them also imply a field concept.

For example:

  • magnetic field in physics,

  • morphogenetic field of living organisms in evolutionary

Remarks

The development of the mathematical function is based on both the old Chinese Yin/Yang theory and well-known general characteristics of structures in interpersonal relations in social networks (Kretschmer, 1999a; Kretschmer, 2002; Appendix E). These general characteristics of social structures are already partly identifiable in groups of higher vertebrates.

One of these general characteristics of structures is well known from the proverb “Birds of a feather flock together”. It means in the example

Hypotheses

We have shown the derivation of a function with four parameters and one constant for describing social or behavioural Gestalts in general (16).

To define hypotheses for the example of co-author pairs’ frequencies Nij, we have to specify the variables for the study of this kind of social Gestalt.

There is a conjecture by de Solla Price (1963), physicist and science historian, that the logarithm of the number of publications is of a higher degree of importance than the number of

Data

Articles published from 1980 to 1998 in the following journals are studied:

  • Science, with 47,117 authors and a total sum of Nij = 418,458 co-author pairs;

  • Nature, with 52,838 authors and a total sum of Nij = 581,698 co-author pairs;

  • Proc Natl Acad Sci USA, with 79,877 authors and a total sum of Nij = 704,032 co-author pairs;

  • Phys Rev B Condensed Matter, with 46,232 authors and a total sum of Nij = 544,006 co-author pairs.
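As a rough orientation, the figures above can be turned into a mean number of collaborators per author. This is only a sketch under an assumption: it takes the total sum of Nij to equal the sum of the numbers of collaborators NP over all authors, which holds if every pair is marked once in each order of the symmetrical matrix. The function name mean_collaborators is introduced here for illustration.

```python
# (number of authors, total sum of Nij) as reported for 1980-1998.
journals = {
    "Science": (47_117, 418_458),
    "Nature": (52_838, 581_698),
    "Proc Natl Acad Sci USA": (79_877, 704_032),
    "Phys Rev B Condensed Matter": (46_232, 544_006),
}


def mean_collaborators(authors, pair_total):
    """Mean NP per author, assuming pair_total is the sum of all NP."""
    return pair_total / authors


for name, (authors, pair_total) in journals.items():
    print(f"{name}: {mean_collaborators(authors, pair_total):.2f}")
```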

Remarks

The methods for counting Ni and Nij are already presented in Section 1 as well as in Appendix A in extended form.

The first evidence for the assumption that the distribution of co-author pairs’ frequencies can be considered a social Gestalt is given by the visualization in Fig. 2 of Section 1. In this connection, the first evidence is also given for the assumption that the distribution of co-author pairs’ frequencies can be better described by a model of social Gestalts than by a simple bivariate

Discussion and proposal for future investigations

The original Lotka's Law refers to the distribution of author frequencies. However, with increasing collaboration in science and technology, the study of the frequency of pairs or triples of co-authors is highly relevant. Kretschmer and Kretschmer have shown that regularities exist in the well-ordered distribution of co-author pairs’ frequencies (Fig. 1). As far as the authors know, no other presentation of this kind of pattern is available in the earlier literature

Acknowledgments

The authors wish to thank Holger Heitsch and Jan Johannes for programming and the reviewers for the helpful comments. Furthermore, the authors want to thank I.K. Ravichandra Rao for his important suggestions.

References (40)

  • P.V. Marsden. Models and methods for characterizing the structural parameters of groups. Social Networks (1981)
  • D. Bohm. Wholeness and the implicate order (1980)
  • F. Capra. Wendezeit: Bausteine für ein neues Weltbild (1996)
  • D. de B. Beaver. Reflections on scientific collaborations (and its study): Past, present and prospective. Scientometrics (2001)
  • D. de Solla Price. Little science, big science (1963)
  • L. Egghe et al. Duality revisited: construction of fractional frequency distributions based on two dual Lotka laws. Journal of the American Society of Information Science and Technology (2002)
  • E.P. Fischer. Das Schöne und das Biest: Ästhetische Momente in der Wissenschaft (1997)
  • W. Glänzel. Coauthorship patterns and trends in the sciences (1980–1998): A bibliometric study with implications for database indexing and search strategies. Library Trends (2002)
  • W. Glänzel et al. Modeling and Measuring Multilateral Co-authorship in International Scientific Collaboration. Part II. A comparative study on the extent and change of international scientific collaboration links. Scientometrics (1997)
  • W. Glänzel et al. Analyzing scientific networks through co-authorship
  • B. Grun. Timetables of history (1975)
  • A. Hellemans et al. Timetables of science (1998)
  • H. Kretschmer. A new model of scientific collaboration. Part I: Types of two-dimensional and three-dimensional collaboration patterns. Scientometrics (1999)
  • H. Kretschmer. Development of structures in coauthorship networks
  • H. Kretschmer. Distribution of co-author couples in journals: “Continuation” of Lotka's Law on the 3rd dimension
  • H. Kretschmer. Similarities and dissimilarities in co-authorship networks; Gestalt theory as explanation for well-ordered collaboration structures and production of scientific literature. Library Trends (2002)
  • H. Kretschmer et al. Comparison of rules in bibliographic and in web networks
  • Kretschmer, H., & Kretschmer T. (2006a). Well-ordered collaboration structures of co-author pairs in journals. In: P....
  • H. Kretschmer et al. Well-ordered collaboration structures of co-author pairs in journals
  • H. Kretschmer et al. Lotka's distribution and distribution of co-author pairs
