Measures of statistical dispersion based on Shannon and Fisher information concepts
Introduction
In recent years, information-based measures of randomness (or “regularity”) of signals have gained popularity in various branches of science [2], [3], [10], [12]. In this paper we construct measures of statistical dispersion based on Shannon and Fisher information concepts, and we describe their properties and mutual relationships. The effort was initiated in [20], where the entropy-based dispersion was employed to quantify certain aspects of neuronal timing precision. Here we extend the previous effort by taking into account the concept of Fisher information (FI), which has been employed in different contexts [5], [12], [33], [37], [38]. In particular, FI about the location parameter has been employed in the analysis of EEG signals [25], [37], of the atomic shell structure [32] (together with Shannon entropy), and in the description of variations among two-electron correlated wavefunctions [14].
The goal of this paper is to propose different dispersion measures and to justify their usefulness. Although the standard deviation is used ubiquitously for the characterization of variability, it is not well suited to quantify certain “intuitively intelligible” properties of the underlying probability distribution. For example, highly variable data need not be random at all if they consist only of “very small” and “very large” values. Although the probability density function (or a histogram of the data) provides a complete view, quantitative methods are needed in order to compare different experimental scenarios.
The methodology investigated here does not adhere to any specific field of application. We believe that the general results are of interest to a wide group of researchers who deal with positive continuous random variables, in theory or in experiments.
Section snippets
Generic case: standard deviation
We consider a continuous positive random variable (r.v.) T with a probability density function (p.d.f.) f(t) and finite first two moments. Generally, statistical dispersion is a measure of “variability” or “spread” of the distribution of r.v. T, and such a measure has the same physical units as T. There are different dispersion measures described in the literature and employed in different contexts, e.g., standard deviation, inter-quartile range, mean difference or the LV coefficient [6], [8],
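The standard deviation and the coefficient of variation, CV = SD/E[T], can be illustrated with a minimal sketch; the gamma-distributed sample below is a hypothetical stand-in for a positive continuous random variable T, not data from the paper.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical sample of a positive continuous r.v. T, drawn here from a
# gamma distribution purely for illustration.
t = rng.gamma(shape=2.0, scale=1.5, size=100_000)

mean = t.mean()
sd = t.std(ddof=1)   # standard deviation: same physical units as T
cv = sd / mean       # coefficient of variation: dimensionless

print(f"E[T] = {mean:.3f}, SD = {sd:.3f}, CV = {cv:.3f}")
```

For a gamma distribution with shape k, the theoretical CV is 1/√k, so the sample estimate above should be close to 1/√2 ≈ 0.707.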
Extrema of variability
Generally, the value of CV can be any non-negative real number, CV ∈ [0, ∞). The lower bound, CV = 0, is approached by a p.d.f. highly peaked at the mean value, in the limit corresponding to the Dirac delta function. There is, however, no unique upper-bound distribution for which CV → ∞. For example, the p.d.f.s analyzed in the next section allow arbitrarily high values of CV, and yet their shapes are different.
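The absence of a unique maximally variable distribution can be seen from the lognormal family: for LN(μ, s²) the coefficient of variation is CV = √(exp(s²) − 1), independent of μ, and it grows without bound with the shape parameter s. A short numeric check:

```python
import numpy as np

# CV of a lognormal LN(mu, s^2) variable: sqrt(exp(s^2) - 1).
# It diverges as s grows, so CV has no finite upper bound.
for s in (0.5, 1.0, 2.0, 3.0):
    cv = np.sqrt(np.exp(s**2) - 1.0)
    print(f"s = {s}: CV = {cv:.2f}")
```

Despite the unbounded CV, the lognormal shape remains unimodal, which is why CV alone does not pin down the form of the distribution.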
Extrema of entropy and its relation to variability
The relation between CV and entropy was investigated in a series of
Lognormal and Pareto distributions
Both the lognormal and the Pareto distribution appear in a broad range of scientific applications [16]. The lognormal distribution is found in the description of, e.g., the concentration of elements in the Earth’s crust, the distribution of organisms in the environment, or in human medicine; see [24] for a review. The Pareto distribution is often described as an alternative model in situations similar to those of the lognormal case, e.g., the sizes of human settlements, sizes of particles or allocation of wealth among
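For the Pareto distribution with tail index α (and arbitrary scale x_m), the coefficient of variation is finite only for α > 2, where CV = 1/√(α(α − 2)); a small sketch (the helper `pareto_cv` is illustrative, not from the paper):

```python
import numpy as np

def pareto_cv(alpha: float) -> float:
    """CV of a Pareto(alpha, x_m) variable; independent of the scale x_m.

    Finite only for alpha > 2, and diverges as alpha -> 2+.
    """
    if alpha <= 2.0:
        raise ValueError("CV is finite only for alpha > 2")
    return 1.0 / np.sqrt(alpha * (alpha - 2.0))

for alpha in (2.1, 3.0, 5.0, 10.0):
    print(f"alpha = {alpha}: CV = {pareto_cv(alpha):.3f}")
```

As with the lognormal case, the CV can be made arbitrarily large (here by letting α approach 2), even though the two families have quite different shapes.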
Discussion and conclusions
We propose and discuss two measures of statistical dispersion for continuous positive random variables: the entropy-based dispersion and the Fisher information-based dispersion. Both measures describe the overall spread of the distribution differently from the coefficient of variation. While the entropy-based dispersion is most sensitive to the concentration of the probability mass (the predictability of random-variable outcomes), the Fisher information-based dispersion is sensitive to the modes of the p.d.f. or to any non-smoothness in the p.d.f.
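Both dispersions can be estimated numerically from a density. The sketch below assumes a particular normalization (chosen so that both measures reduce to the standard deviation σ for a normal distribution): the entropy-based dispersion as exp(h)/√(2πe), with h the differential entropy, and the Fisher-based dispersion as 1/√J, with J the Fisher information about a location parameter; the paper’s exact normalization may differ.

```python
import numpy as np

sigma = 2.0
x = np.linspace(-12.0, 12.0, 40_001)          # +-6 sigma grid
dx = x[1] - x[0]
f = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# Differential entropy h = -∫ f ln f dx (simple Riemann sum).
h = -np.sum(f * np.log(f)) * dx

# Fisher information about location, J = ∫ (f')^2 / f dx.
df = np.gradient(f, x)
J = np.sum(df**2 / f) * dx

sigma_h = np.exp(h) / np.sqrt(2 * np.pi * np.e)   # entropy-based dispersion
sigma_J = 1.0 / np.sqrt(J)                        # Fisher-based dispersion
print(sigma_h, sigma_J)
```

For the Gaussian density used here, both estimates should recover σ = 2, consistent with the fact that the normal law maximizes entropy and minimizes Fisher information at fixed variance; for multimodal or non-smooth densities the two measures separate, which is the behaviour discussed above.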
Acknowledgements
This work was supported by the Institute of Physiology RVO:67985823, the Centre for Neuroscience P304/12/G069 and by the Grant Agency of the Czech Republic Projects P103/11/0282 and P103/12/P558.
References (38)
- et al., On minimum Fisher information distributions with restricted support and fixed variance, Inform. Sci. (2009)
- et al., Perception of categories: from coding efficiency to reaction times, Brain Res. (2012)
- et al., Comparative characterization of two-electron wavefunctions using information-theory measures, Phys. Lett. A (2009)
- et al., Fisher’s information and the analysis of complex signals, Phys. Lett. A (1999)
- The Pareto, Zipf and other power laws, Econ. Lett. (2001)
- et al., The Fisher–Shannon information plane for atoms, Phys. Lett. A (2008)
- et al., Fisher information measure of geoelectrical signals, Physica A (2005)
- et al., Analysis of signals in the Fisher–Shannon information plane, Phys. Lett. A (2003)
- et al., Handbook of Mathematical Functions, With Formulas, Graphs, and Mathematical Tables (1965)
- et al., A maximum entropy approach to natural language processing, Comput. Linguist. (1996)
- Assessment of spike activity in the supraoptic nucleus, J. Neuroendocrinol.
- Ethical Social Index Numbers
- Elements of Information Theory
- Mathematical Methods of Statistics
- Stimulus-dependent modulation of spike burst length in cat striate cortical cells, J. Neurophysiol.
- Inducing features of random fields, IEEE Trans. Pattern Anal. Mach. Intell.
- Firing variability is higher than deduced from the empirical coefficient of variation, Neural Comput.
- Physics from Fisher Information: A Unification
- Nonparametric roughness penalties for probability densities, Biometrika
Cited by (30)
- Deep learning-driven underwater polarimetric target detection based on the dispersion of polarization characteristics, Optics and Laser Technology (2024)
- Estimation of the instantaneous spike train variability, Chaos, Solitons and Fractals (2023)
- Decentralised finance’s timocratic governance: The distribution and exercise of tokenised voting rights, Technology in Society (2023)
- Quantum information-entropic measures for exponential-type potential, Results in Physics (2020)
- Total cost of risk for privatized electric power generation under pipeline vandalism, Heliyon (2018). Citation excerpt: “The lost MWh standard deviation, δ, of the case study plant as calculated from Eq. (14) is 48,176.776 MWh. The realized value of standard deviation indicates how off-centered the series of lost MWh power is from the mean value [44]. This resultant risk value is large and amounts to unserved electricity.”