General evolutionary theory of information production processes and applications to the evolution of networks
Introduction
The informetrics of information production processes (IPPs) can be described via the so-called size-frequency function f:

$$f : [a, \rho_m] \to \mathbb{R}^+ \tag{1}$$

where f(j) denotes the density of the sources in item density j. This is the continuous extension of the classical f(j) = number of sources with j items, and we let j ≥ a ≥ 1 also be limited to a maximal item density ρm (see also Egghe, 2005, but there a = 1; here we use a general a > 0 since we have an application of this case, see further). A classical example is the law of Lotka, where f is a decreasing power law; this case will also be considered after the general theory.
The size-frequency function f is equivalent to the rank-frequency function g:

$$g : [0, T] \to [a, \rho_m] \tag{2}$$

where f and g are related as

$$r = g^{-1}(j) = \int_j^{\rho_m} f(j')\,dj' \tag{3}$$

where g^{-1} denotes the inverse function of g. It is clear from (3) that g(r) denotes the item density in the source on rank density r. This is the continuous extension of the discrete rank-frequency function, where g(r) = number of items in the source on rank r, and T denotes (also in the continuous setting) the total number of sources.
The equivalence of the functions f and g is seen as follows: (3) yields g^{-1} (hence g), given f, and it follows from (3) that

$$f(j) = -\frac{1}{g'(g^{-1}(j))} \tag{4}$$

for all j ∈ [a, ρm], hence f follows from g, showing the equivalence (see also Egghe, 2005). It is also well known (see Egghe, 2005 or Egghe and Rousseau, 1990) that, in case f is a decreasing power law (i.e. Lotka's law), g is the so-called law of Mandelbrot (which we will describe in detail below).
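The equivalence can be checked numerically. The sketch below takes (3) to be the usual relation r = g^{-1}(j) = integral of f from j to ρm, uses a Lotka-type f, and recovers f from g through a finite-difference derivative. The parameter values (C, ALPHA, A_MIN, RHO_M) and the function names are illustrative assumptions, not values from the paper.

```python
# Illustrative parameters (assumed, not taken from the paper).
C, ALPHA = 1.0, 2.5        # Lotka-type size-frequency function f(j) = C / j**ALPHA
A_MIN, RHO_M = 1.0, 50.0   # item-density range [a, rho_m]


def f(j):
    """Size-frequency function: density of sources at item density j."""
    return C / j**ALPHA


def g_inverse(j):
    """Rank density r = integral of f from j to rho_m (closed form for a power law)."""
    return C / (ALPHA - 1) * (j ** (1 - ALPHA) - RHO_M ** (1 - ALPHA))


def g(r):
    """Rank-frequency function, obtained by solving r = g_inverse(j) for j."""
    return (RHO_M ** (1 - ALPHA) + (ALPHA - 1) * r / C) ** (-1 / (ALPHA - 1))


def f_from_g(j, h=1e-6):
    """Recover f(j) as -1 / g'(g^{-1}(j)) using a central difference for g'."""
    r = g_inverse(j)
    slope = (g(r + h) - g(r - h)) / (2 * h)
    return -1.0 / slope


# Total number of sources: the rank density reached at the minimum item density.
T = g_inverse(A_MIN)

if __name__ == "__main__":
    for j in (1.5, 5.0, 20.0):
        print(j, f(j), f_from_g(j))
```

In this sketch g(0) equals ρm and g(T) equals a, so the highest-ranked source carries the maximal item density and the lowest-ranked one the minimal density.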
Egghe (2004) (see also Egghe, 2003) studies positive reinforcement of IPPs, in which a transformation φ is applied to the function g, i.e. g is transformed into g* = φ∘g, where φ has certain properties, e.g. φ(x) ≥ x for all x and φ strictly increasing. In Egghe (2003, 2004), the connection of positively reinforced IPPs with linear 3-dimensional informetrics (i.e. the composition of 2 IPPs) is highlighted, and the concentration properties of these positively reinforced IPPs are indicated using the theorem of Fellman and Jakobsson (see Fellman, 1976; Jakobsson, 1976; see also Egghe, 2007).
In Egghe (2003, 2004) and Egghe and Rousseau (2006a), a transformation of g in the following sense has been studied:

$$g^*(r) = B\,g(r)^c \tag{5}$$

with B, c > 1 (i.e. φ(x) = Bx^c), yielding, for Lotkaian IPPs, lower Lotka exponents. In Egghe and Rousseau (2006a) the extra generalization j ∈ [a, ρm] with a ≥ 1 is used. In this case the transformation (5) not only leads to lower Lotka exponents but also to higher minimum density values a > 1. This, in turn, gives a rationale for systems in which sources do not have a low number of items, as is the case for database sizes or country or city sizes. That these cases go together with low values of the exponent of the Lotka function has been experimentally verified in Egghe and Rousseau (2006a).
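A small worked sketch of this effect follows. It assumes an exact Lotka law f(j) = C/j^α on [a, ρm] and the transformation φ(x) = Bx^c with the rank densities left unchanged. Under those assumptions the transformed system is again Lotkaian, with exponent 1 + (α − 1)/c and minimum density Ba^c. The function name `reinforce_lotka` is hypothetical, and the closed forms are derived here from the stated assumptions rather than quoted from the cited papers.

```python
def reinforce_lotka(alpha, a, B, c):
    """Return (alpha_star, a_star) after applying g* = B * g**c to a Lotkaian IPP.

    alpha : Lotka exponent of the original system (alpha > 1)
    a     : minimum item density of the original system (a > 0)
    B, c  : parameters of phi(x) = B * x**c, both > 1 for positive reinforcement
    """
    if not (alpha > 1 and a > 0 and B > 1 and c > 1):
        raise ValueError("expected alpha > 1, a > 0, B > 1 and c > 1")
    # g is of Mandelbrot type with exponent beta = 1 / (alpha - 1);
    # raising g to the power c multiplies beta by c, so alpha - 1 is divided by c.
    alpha_star = 1 + (alpha - 1) / c
    # The minimum item density is mapped by phi itself.
    a_star = B * a**c
    return alpha_star, a_star


if __name__ == "__main__":
    print(reinforce_lotka(alpha=2.0, a=1.0, B=2.0, c=2.0))  # (1.5, 2.0)
```

With α = 2, a = 1 and B = c = 2 the exponent drops to 1.5 while the minimum density rises to 2, matching the qualitative statement above: a lower Lotka exponent together with a higher minimum density.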
Discussions with Cothey (July 2005) revealed that an extra generalization of the above formalism (essentially the transformation g → φ∘g) is needed. Indeed, the transformation φ acts on the item densities j = g(r) but leaves the source rank densities unchanged. Cothey informed us that the framework of IPPs applies well to networks (where sources are nodes and items are hyperlinks: in- or outlinks) but that a model is needed that can describe, for example, disappearing sources (nodes), while still allowing for disappearing items as well. The creation of sources and items should also be covered.
In view of the above it is clear what to do: the transformation φ above, in its full generality, works well to describe changes (dynamics) of items. So “all we have to do” is to introduce another transformation, called ψ below, in order to describe the changes (dynamics) of the sources.
In the next section the second transformation ψ will act on the rank densities r. So, instead of the transformation

$$g^* = \varphi \circ g \tag{6}$$

we will generalise (6) as follows:

$$g^*(r^*) = \varphi(g(r)), \qquad r^* = \psi(r) \tag{7}$$

so that the source rankings are transformed as well. This very general model (7) will be studied in the next section and its equivalent size-frequency function f* will be calculated.
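To make the two-transformation model concrete, the sketch below reads it as g*(ψ(r)) = φ(g(r)) and takes both transformations to be of power-law type, ψ(r) = A·r^b and φ(j) = B·j^c, on top of a Lotka-type f. It then compares a numerically differentiated f* with the chain-rule expression f*(j*) = f(j)·ψ'(r)/φ'(j), where j* = φ(j) and r = g^{-1}(j). That expression is derived here under the assumption that φ and ψ are strictly increasing and differentiable; it is not quoted from the paper, and all parameter values are illustrative.

```python
# Illustrative parameters (assumed, not taken from the paper).
C, ALPHA, RHO_M = 1.0, 2.5, 50.0   # Lotka-type f(j) = C / j**ALPHA on [a, RHO_M]
A, b = 0.8, 1.2                    # source transformation psi(r) = A * r**b
B, c = 2.0, 1.5                    # item transformation   phi(j) = B * j**c


def f(j):
    return C / j**ALPHA


def g_inverse(j):
    """Rank density r = integral of f from j to RHO_M."""
    return C / (ALPHA - 1) * (j ** (1 - ALPHA) - RHO_M ** (1 - ALPHA))


def g(r):
    return (RHO_M ** (1 - ALPHA) + (ALPHA - 1) * r / C) ** (-1 / (ALPHA - 1))


def psi(r):
    return A * r**b


def psi_inverse(r_star):
    return (r_star / A) ** (1 / b)


def phi(j):
    return B * j**c


def phi_inverse(j_star):
    return (j_star / B) ** (1 / c)


def g_star(r_star):
    """New rank-frequency function defined by g*(psi(r)) = phi(g(r))."""
    return phi(g(psi_inverse(r_star)))


def f_star_numeric(j_star, h=1e-6):
    """f*(j*) = -1 / (dg*/dr*), with the derivative taken by central difference."""
    r_star = psi(g_inverse(phi_inverse(j_star)))
    slope = (g_star(r_star + h) - g_star(r_star - h)) / (2 * h)
    return -1.0 / slope


def f_star_chain_rule(j_star):
    """Chain-rule form f(j) * psi'(r) / phi'(j), with j = phi^{-1}(j*), r = g^{-1}(j)."""
    j = phi_inverse(j_star)
    r = g_inverse(j)
    dpsi = A * b * r ** (b - 1)
    dphi = B * c * j ** (c - 1)
    return f(j) * dpsi / dphi


if __name__ == "__main__":
    for j in (2.0, 5.0):
        j_star = phi(j)
        print(j_star, f_star_numeric(j_star), f_star_chain_rule(j_star))
```

Setting A = b = 1 in this sketch leaves the rank densities untouched and reduces it to the single transformation g* = φ∘g.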
In Section 3, the results obtained will be applied to Lotkaian systems and to transformations φ and ψ of power law type. Also in this case the equivalent size-frequency function f* will be calculated, thereby extending the results in Egghe (2003, 2004) and Egghe and Rousseau (2006a).
Several applications of these results are described in Section 4. The applications range from general IPPs to country or city size distributions, database distributions or (as initiated by Cothey) network distributions and their dynamics (evolutions).
Section snippets
General evolutionary model for IPPs
Let us have a first system (IPP) given by f as size-frequency function and by its equivalent (cf. (3), (4)) rank-frequency function g. Suppose this system is “changing” into a new system that we describe by asterisks: f* and g*.
To allow for the largest possible freedom of evolution of the first IPP into the second we allow for a transformation of the source densities as well as of the item densities as follows: we define
Power law transformations in Lotkaian IPPs
Now the obtained results will be applied to Lotkaian IPPs where φ and ψ are transformations of power law type.
Lotkaian IPPs are IPPs where we have a decreasing power law for the size-frequency function f:

$$f(j) = \frac{C}{j^{\alpha}} \tag{18}$$

with C > 0 and α > 1 constants, j ∈ [a, ρm]. Case 3.1 As proved in Egghe and Rousseau (1990) (see also Egghe, 2005) we now have for the rank-frequency function g (equivalent to f in (18)): with ρm < ∞
Note that the value of a is only implicitly involved in (19), being the lowest
Applications
1. No sources are destroyed or created, but items can be destroyed (example: no nodes in a network are destroyed or created, but some in-links are destroyed). Here, clearly, A = b = 1 in (23). We can assume that the destruction of items follows a random sample of the items, hence sources with a large number of items have a higher probability of an item deletion, the probability being proportional to the source's size. This implies c = 1, 0 < B < 1 in (24) (B being 1- sample
Acknowledgements
The author is grateful to Prof. Dr. V. Cothey for mentioning the problem of modelling network evolutions and to Profs. Dr. V. Cothey and R. Rousseau for interesting discussions on the topic of this paper. The author is also grateful to two anonymous referees for giving good suggestions to improve this paper, both in content and in style.
References (10)
- et al. Systems without low-productive sources. Information Processing and Management (2006).
- On the measurement of the degree of progression. Journal of Public Economics (1976).
- et al. The size distribution of cities: An examination of the Pareto law and primacy. Journal of Urban Economics (1980).
- Positive reinforcement and 3-dimensional informetrics. Scientometrics (2004).