Transformations of basic publication–citation matrices
Introduction: the basic publication–citation matrix
A basic publication–citation matrix, or p–c matrix for short, is a table showing the publication and citation data needed to calculate an informetric indicator such as a journal impact factor or an R-sequence (Liang, 2005). Examples of the use of p–c matrices can be found in Frandsen and Rousseau (2005), Ingwersen, Larsen, Rousseau, and Russell (2001), and Liang (2005). Data in such matrices are usually shown in chronological order. Publication data may be given either row by row, as we will do in
Generalized impact factors and their relation with rhythm indicators
In Frandsen and Rousseau (2005), a general approach to the notion of an impact factor was introduced. This framework makes it possible to base an analysis on several years of publications. Traditional synchronous or diachronous citation studies use only one row or one column of the citation submatrix. In the synchronous approach the citation period is fixed; in the diachronous approach the publication year (which is the cited year) is fixed. The more general Frandsen–Rousseau
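The difference between the two traditional approaches can be illustrated with a minimal sketch. It assumes the convention used later in this text, where Ci,j is the number of citations received in year j by articles published in year i, so rows are publication years and columns are citation years. The matrix values and the function names are hypothetical, chosen only for illustration.

```python
# Hypothetical citation submatrix C (illustrative values only).
# Assumed convention: rows = publication years, columns = citation years.
C = [
    [2, 5, 4],
    [0, 3, 6],
    [0, 0, 1],
]


def diachronous_slice(c, i):
    """Fix the publication year i: return one row of the citation submatrix."""
    return list(c[i])


def synchronous_slice(c, j):
    """Fix the citation year j: return one column of the citation submatrix."""
    return [row[j] for row in c]


row_for_first_publication_year = diachronous_slice(C, 0)
column_for_last_citation_year = synchronous_slice(C, 2)
```

Under this assumed layout a diachronous study reads along a row and a synchronous study reads down a column; with the transposed layout the roles would be swapped.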
The R-transformation
This transformation on a p–c matrix M is denoted as R. It maps the matrix M to the p–c matrix R(M). It leaves the publication column P unchanged: the first column of R(M) is equal to P. R transforms the c-submatrix C with elements Ci,j into the c-submatrix of R(M) with elements Ri,j where
The first equation establishes the matrix elements Ri,j as a weighted sum of citations, the second one as the number of publications multiplied
The AV-transformation
The more (citable) articles a journal, an institute, etc. publishes, the larger its citation potential. To take this effect into account one may use the following averaging transformation, denoted as AV. Applied to the matrix M this yields the matrix AV(M) = A. The elements of AV(M) are equal to the elements of M divided by Pi, the first element of the ith row of M: the publication entry Pi becomes Pi/Pi = 1 and each citation entry Ci,j becomes Ci,j/Pi.
In particular, we see that the first column of A consists of ones. The citation part of the AV-transformed matrix
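The AV-transformation as described above can be sketched in a few lines. This is a minimal illustration, assuming each row of M is stored as a list whose first element is the publication count Pi followed by the citation counts; the function name av_transform, the sample values, and the handling of rows with zero publications are assumptions, not part of the original text.

```python
def av_transform(m):
    """Divide every element of each row by that row's publication count P_i."""
    result = []
    for row in m:
        p_i = row[0]  # first element of the row is the publication count
        if p_i == 0:
            # Assumption: the transformation is left undefined for empty rows.
            raise ValueError("row has zero publications; AV is undefined")
        result.append([x / p_i for x in row])
    return result


# Hypothetical p-c matrix: [P_i, C_i1, C_i2] per row.
M = [
    [10, 2, 5],
    [20, 0, 4],
]
A = av_transform(M)
first_column = [row[0] for row in A]
```

Here first_column contains only ones, and the remaining entries of A are citations per published article.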
The -transformation
The -transformation is the multiplicative analogue of the R-transformation. It maps the matrix M to the p–c matrix (M). too leaves the publication column P unchanged. Further, transforms the c-submatrix with elements Ci,j into the c-submatrix (C) with elements where
The -transformation leaves the product of all elements of the p–c matrix invariant. Hence, it can be considered a multiplicative rearrangement of the p–c matrix.
Normalizing with respect to the size of the pool
The potential number of retrieved citations clearly depends on the size of the database used (the pool). As the p–c matrix usually covers several periods, it is often a good idea to normalize each period's citations by the size of the pool during that period. Let Zj denote the size of the pool in interval Ij, j = 1, …, n. The size-normalized p–c matrix is denoted as N. Compared with the original matrix, the publication column does not change, but Ci,j is transformed into Ni,j = Ci,j/Zj.
This transformation
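The size normalization Ni,j = Ci,j/Zj can be sketched as follows. This is a minimal illustration, assuming each row of M is a list holding Pi followed by the citation counts Ci,1, …, Ci,n, and that the pool sizes Z1, …, Zn are given as a separate list; the function name normalize_by_pool and all numbers are hypothetical.

```python
def normalize_by_pool(m, z):
    """Keep the publication column; divide citation column j by pool size Z_j."""
    normalized = []
    for row in m:
        p_i, citations = row[0], row[1:]
        if len(citations) != len(z):
            raise ValueError("need exactly one pool size per citation column")
        normalized.append([p_i] + [c / z_j for c, z_j in zip(citations, z)])
    return normalized


# Hypothetical p-c matrix [P_i, C_i1, C_i2] and pool sizes [Z_1, Z_2].
M = [
    [10, 2, 5],
    [20, 0, 4],
]
Z = [100, 200]
N = normalize_by_pool(M, Z)
```

Unlike the AV-transformation, this step relies on data external to the p–c matrix, namely the pool sizes.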
Other types of p–c matrices
- A. Instead of the matrix C, where Ci,j denotes the total number of citations received in the year j by articles published in a particular journal in the year i, one may also use the simpler matrix T, where Ti,j denotes the total number of articles published in this journal in the year i and cited in the year j (ignoring the precise number of citations each article received, hence Ti,j is equal to Pi minus the number of articles that are uncited in the year j).
- B. Besides a ‘cited’ perspective, it is
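The relation between C and T in variant A can be made concrete with a small sketch for a single publication year i. It assumes that per-article citation counts are available, which the original text does not state; the data layout, the function names, and the numbers are hypothetical.

```python
# Hypothetical data for one publication year i:
# per_article[a][j] = citations received by article a in citation year j.
per_article = [
    [3, 0, 1],
    [0, 0, 2],
    [1, 0, 0],
]


def c_row(counts):
    """C_{i,j}: total citations received in each citation year j."""
    return [sum(column) for column in zip(*counts)]


def t_row(counts):
    """T_{i,j}: number of articles cited at least once in each citation year j."""
    return [sum(1 for x in column if x > 0) for column in zip(*counts)]


def uncited_row(counts):
    """Number of articles that received no citation in each citation year j."""
    return [sum(1 for x in column if x == 0) for column in zip(*counts)]


P_i = len(per_article)  # number of articles published in year i
C_i = c_row(per_article)
T_i = t_row(per_article)
U_i = uncited_row(per_article)
```

For every citation year the entries satisfy T_i[j] == P_i - U_i[j], which is the identity stated in variant A.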
p–c matrices with discrete steps
In the p–c matrices studied thus far rows and columns reflect time periods. Yet, in some studies it is also meaningful to organize the p–c matrix by discrete steps, more concretely: each row and column refers to exactly one article or one journal issue, presented in the order in which they are written or published. This approach is especially interesting in self-citation studies, see e.g. (Glänzel, Thijs, & Schlemmer, 2004). We consider the two examples of (1) all articles published by one
Discussion and conclusion
The use of basic publication–citation matrices for the construction of informetric indicators has been highlighted. Transforming these publication–citation matrices clarifies the construction of other indicators. Examples of such transformations, such as the R- and AV-transformations, are presented. A distinction has been made between transformations using only elements of the given p–c matrix and transformations using external elements, e.g. the size of the citation pool.
In the spirit of
Acknowledgements
The authors thank Leo Egghe for pointing out that the proof of Theorem 2 can easily be derived from Theorem 1. They also thank the anonymous referees for helpful observations. This work is sponsored by the National Natural Science Foundation of China (Project 70373055).
References (11)
- Conglomerates as a general framework for informetric research. Information Processing and Management (2005).
- A history of mathematics (1989).
- Frandsen & Rousseau. Article impact calculated over arbitrary periods. Journal of the American Society for Information Science and Technology (2005).
- Glänzel, Thijs, & Schlemmer. A bibliometric approach to the role of author self-citations in scientific communication. Scientometrics (2004).
- Ingwersen, Larsen, Rousseau, & Russell. The publication–citation matrix and its derived quantities. Chinese Science Bulletin (2001).