Abstract
The Shannon information content is a fundamental quantity, and estimating it from an observed dataset is of great importance in statistics, information theory, and machine learning. In this study, an estimator of the information content for a given set of weighted data is proposed; the empirical data distribution varies with the weights. The notable features of the proposed estimator are its computational efficiency and its ability to handle weighted data. The estimator is then extended to estimate cross entropy, entropy, and the Kullback–Leibler (KL) divergence from weighted data. Finally, the estimators are applied to classification with one-class samples and to distribution-preserving data compression problems.
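To make the notion of entropy estimation from weighted data concrete, the following is a minimal sketch of a weighted plug-in estimator: the density is approximated by a weighted Gaussian kernel density estimate, and the entropy by the weighted average of negative log-densities at the sample points, H ≈ −Σᵢ wᵢ log p̂(xᵢ). This is a generic illustration of the idea, not the computationally efficient estimator proposed in the paper; all function names and the bandwidth choice are assumptions.

```python
import math
import random

def weighted_kde(x, samples, weights, bandwidth):
    """Weighted Gaussian kernel density estimate at point x (1-D)."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    return sum(
        w * math.exp(-0.5 * ((x - s) / bandwidth) ** 2) / norm
        for s, w in zip(samples, weights)
    )

def weighted_entropy(samples, weights, bandwidth=0.3):
    """Plug-in entropy estimate for weighted data:
    H ~ -sum_i w_i * log p_hat(x_i), with weights normalised to sum to one.
    NOTE: an illustrative sketch, not the estimator proposed in the paper."""
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalise the weights
    return -sum(
        wi * math.log(weighted_kde(xi, samples, w, bandwidth))
        for xi, wi in zip(samples, w)
        if wi > 0  # zero-weight points do not contribute
    )

# With uniform weights this reduces to an ordinary resubstitution
# estimate; for N(0,1) the true differential entropy is
# 0.5*log(2*pi*e) ~ 1.419 nats, a useful sanity reference.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(500)]
uniform = [1.0] * len(data)
print(round(weighted_entropy(data, uniform), 3))
```

Changing the weights changes the empirical distribution the estimate refers to: concentrating all weight on a single sample, for example, drives the estimate down toward the (negative log) kernel peak, reflecting a nearly degenerate distribution.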
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Hino, H., Murata, N. (2011). A Computationally Efficient Information Estimator for Weighted Data. In: Honkela, T., Duch, W., Girolami, M., Kaski, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2011. ICANN 2011. Lecture Notes in Computer Science, vol 6792. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21738-8_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21737-1
Online ISBN: 978-3-642-21738-8
eBook Packages: Computer Science (R0)