Abstract
In statistical analysis of measurement results, it is often necessary to compute the range \([\underline V,\,\overline V]\) of the population variance \(V = \frac{1}{n} \cdot \sum\limits_{i = 1}^n (x_i - E)^2\), where \(E = \frac{1}{n} \cdot \sum\limits_{i = 1}^n x_i\), when we only know the intervals \([\tilde x_i - \Delta_i,\,\tilde x_i + \Delta_i]\) of possible values of the \(x_i\). While \(\underline V\) can be computed efficiently, the problem of computing \(\overline V\) is, in general, NP-hard. In our previous paper “Population Variance under Interval Uncertainty: A New Algorithm” (Reliable Computing 12(4) (2006), pp. 273–280), we showed that in a practically important case we can use constraint techniques to compute \(\overline V\) in time \(O(n \cdot \log(n))\). In this paper, we provide new algorithms that compute \(\underline V\) (in all cases) and \(\overline V\) (for the above case) in linear time \(O(n)\).
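To make the two bounds concrete, here is a minimal Python sketch for small inputs. It is not the linear-time algorithm announced in this abstract. The lower bound uses bisection on a common value m, relying on the fact that a minimizing tuple has the form x_i = clip(m, interval_i) with m equal to the mean of the clipped values. The upper bound uses exhaustive search over interval endpoints, which is valid because V is convex but takes exponential time. All function names are hypothetical.

```python
from itertools import product


def variance(xs):
    """Population variance V = (1/n) * sum((x_i - E)^2)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def variance_lower(intervals, iters=200):
    """Lower endpoint of the variance range, found by bisection (illustrative)."""
    n = len(intervals)
    lo = min(a for a, _ in intervals)
    hi = max(b for _, b in intervals)

    def clipped(m):
        # Move each x_i as close to m as its interval allows.
        return [min(max(m, a), b) for a, b in intervals]

    # mean(clipped(m)) - m is non-increasing in m, so bisection finds a root.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(clipped(mid)) / n > mid:
            lo = mid
        else:
            hi = mid
    return variance(clipped((lo + hi) / 2))


def variance_upper_bruteforce(intervals):
    """Upper endpoint by trying all 2^n endpoint combinations (small n only)."""
    return max(variance(v) for v in product(*intervals))


if __name__ == "__main__":
    data = [(0.0, 1.0), (2.0, 3.0)]
    print(variance_lower(data), variance_upper_bruteforce(data))
```

The exhaustive upper bound is only practical for a handful of intervals, which is consistent with the NP-hardness of the general problem stated above.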
Similar linear-time algorithms are described for computing the range of the entropy \(S = -\sum\limits_{i = 1}^n p_i \cdot \log(p_i)\) when we only know the intervals \({\bf P}_i = [\underline p_i,\,\overline p_i]\) of possible values of the probabilities \(p_i\).
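As an illustration of the entropy problem, the Python sketch below computes only the largest entropy value, under the assumption (not spelled out in the abstract) that the probabilities must also satisfy \(\sum p_i = 1\). Because S is concave, a maximizing tuple has the form p_i = clip(c, P_i) for a common level c chosen so that the probabilities sum to 1, and the sketch finds c by bisection. This is not the linear-time algorithm of the paper, and the function names are hypothetical.

```python
import math


def entropy(ps):
    """Shannon entropy S = -sum(p_i * log(p_i)), with 0 * log(0) taken as 0."""
    return -sum(p * math.log(p) for p in ps if p > 0)


def max_entropy(intervals, iters=200):
    """Largest entropy over p_i in [lo_i, hi_i] with sum(p_i) = 1 (illustrative)."""
    if not sum(a for a, _ in intervals) <= 1.0 <= sum(b for _, b in intervals):
        raise ValueError("no probability vector fits the given intervals")

    def clipped(c):
        # Each p_i is the common level c, clipped to its own interval.
        return [min(max(c, a), b) for a, b in intervals]

    # sum(clipped(c)) is non-decreasing in c, so bisection finds the level.
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(clipped(mid)) < 1.0:
            lo = mid
        else:
            hi = mid
    ps = clipped((lo + hi) / 2)
    return entropy(ps), ps


if __name__ == "__main__":
    s, ps = max_entropy([(0.5, 0.6), (0.0, 1.0), (0.0, 1.0)])
    print(s, ps)
```

In the usage example, the first probability is forced up to its lower endpoint 0.5 and the remaining mass is split evenly between the other two.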
In general, a statistical characteristic ƒ can be more complex, so that even computing ƒ can take much longer than linear time. For such ƒ, the question is how to compute the range \([\underline y,\,\overline y]\) in as few calls to ƒ as possible. We show that for convex symmetric functions ƒ, we can compute \(\overline y\) in n calls to ƒ.
References
Berberian S. (1974) Lectures in Functional Analysis and Operator Theory. Springer, New York, Heidelberg, Berlin
Boyd S., Vandenberghe L. (2004) Convex Optimization. Cambridge University Press
van der Broek P., Noppen J. (2006) Fuzzy Weighted Average: Alternative Approach. In: Proceedings of the 25th International Conference of the North American Fuzzy Information Processing Society NAFIPS’2006, Montreal, Quebec, Canada, June 3–6, 2006
Cormen Th.H., Leiserson C.E., Rivest R.L., Stein C. (2001) Introduction to Algorithms. MIT Press, Cambridge
Dantsin E., Kreinovich V., Wolpert A., Xiang G. (2006) Population Variance under Interval Uncertainty: A New Algorithm. Reliable Computing 12(4):273–280
Dantsin E., Wolpert A., Ceberio M., Xiang G., Kreinovich V. (2006) Detecting Outliers under Interval Uncertainty: A New Algorithm Based on Constraint Satisfaction. In: Proceedings of the International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems IPMU’06, Paris, France, July 2–7, 2006, pp. 802–809
Ferson S., Ginzburg L., Kreinovich V., Longpré L., Aviles M. (2005) Exact Bounds on Finite Populations of Interval Data. Reliable Computing 11(3):207–233
Ferson S., Kreinovich V., Hajagos J., Oberkampf W., Ginzburg L. (2007) Experimental Uncertainty Estimation and Statistics for Data Having Interval Uncertainty. Sandia National Laboratories, Report SAND2007-0939
Hansen P., de Aragao M.V.P., Ribeiro C.C. (1991) Hyperbolic 0-1 Programming and Optimization in Information Retrieval. Math. Programming 52:255–263
Jaulin L., Kieffer M., Didrit O., Walter E. (2001) Applied Interval Analysis: with Examples in Parameter and State Estimation, Robust Control and Robotics. Springer Verlag, London
Klir G.J. (2005) Uncertainty and Information: Foundations of Generalized Information Theory. J. Wiley, Hoboken, New Jersey
Kreinovich V. (1996) Maximum Entropy and Interval Computations. Reliable Computing 2(1):63–79
Kreinovich V., Longpré L., Patangay P., Ferson S., Ginzburg L. (2005) Outlier Detection under Interval Uncertainty: Algorithmic Solvability and Computational Complexity. Reliable Computing 11(1):59–76
Kreinovich V., Xiang G., Starks S.A., Longpré L., Ceberio M., Araiza R., Beck J., Kandathi R., Nayak A., Torres R., Hajagos J. (2006) Towards Combining Probabilistic and Interval Uncertainty in Engineering Calculations: Algorithms for Computing Statistics under Interval Uncertainty, and Their Computational Complexity. Reliable Computing 12(6):471–501
Nivlet P., Fournier F., Royer J. (2001) A New Methodology to Account for Uncertainties in 4-D Seismic Interpretation. In: Proc. 71st Annual Int’l Meeting of Soc. of Exploratory Geophysics SEG’2001, San Antonio, September 9–14, 2001, pp. 1644–1647
Nivlet P., Fournier F., Royer J. (2001) Propagating Interval Uncertainties in Supervised Pattern Recognition for Reservoir Characterization. In: Proc. 2001 Society of Petroleum Engineers Annual Conf. SPE’2001, New Orleans, September 30–October 3, 2001, paper SPE-71327
Rabinovich S. (1993) Measurement Errors: Theory and Practice. American Institute of Physics, New York
Roberts A.W., Varberg D.E. (1973) Convex Functions. Academic Press, New York and London
Vavasis S.A. (1991) Nonlinear Optimization: Complexity Issues. Oxford Science, New York
Webster R. (1994) Convexity. Oxford University Press, Oxford, New York, Tokyo
Xiang G., Kosheleva O., Klir G.J. (2006) Estimating Information Amount under Interval Uncertainty: Algorithmic Solvability and Computational Complexity. In: Proceedings of the International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems IPMU’06, Paris, France, July 2–7, 2006, pp. 840–847
Cite this article
Xiang, G., Ceberio, M. & Kreinovich, V. Computing Population Variance and Entropy under Interval Uncertainty: Linear-Time Algorithms. Reliable Comput 13, 467–488 (2007). https://doi.org/10.1007/s11155-007-9045-6