
A Computationally Efficient Information Estimator for Weighted Data

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2011

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6792)


Abstract

The Shannon information content is a fundamental quantity, and estimating it from an observed dataset is of great importance in statistics, information theory, and machine learning. In this study, an estimator of the information content for a given set of weighted data is proposed; the empirical data distribution varies depending on the weights. The notable features of the proposed estimator are its computational efficiency and its ability to handle weighted data. The estimator is then extended to estimate cross entropy, entropy, and the Kullback-Leibler (KL) divergence from weighted data. Finally, the estimators are applied to classification with one-class samples and to distribution-preserving data compression problems.
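The abstract describes the target quantities without formulas, and the paper's own computationally efficient estimator is not reproduced here. As a generic illustration of what is being estimated, the sketch below uses a simple weighted Gaussian-KDE plug-in for one-dimensional data: the weighted empirical distribution places mass w_i on sample x_i, the smoothed density gives the information content -log p(x), and entropy, cross entropy, and KL divergence follow as weighted averages. All function names, the bandwidth value, and the KDE construction are illustrative assumptions, not the authors' method.

```python
import numpy as np

def weighted_kde_logpdf(x, samples, weights, bandwidth):
    """Log-density at x under a weighted Gaussian KDE.

    NOTE: a generic plug-in construction for illustration only, not the
    estimator proposed in the paper. The weighted empirical distribution
    puts mass weights[i] on samples[i]; smoothing with a Gaussian kernel
    yields a density p_hat(x).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                             # normalise the weights
    diffs = (x - np.asarray(samples)) / bandwidth
    kernel = np.exp(-0.5 * diffs ** 2) / (np.sqrt(2.0 * np.pi) * bandwidth)
    return np.log(np.dot(w, kernel))

def information_content(x, samples, weights, bandwidth=0.3):
    """Plug-in estimate of the Shannon information content -log p(x)."""
    return -weighted_kde_logpdf(x, samples, weights, bandwidth)

def entropy(samples, weights, bandwidth=0.3):
    """Weighted resubstitution entropy: the weighted average of -log p(x_i).

    Each point is included in its own density estimate, which biases the
    value slightly; acceptable for a sketch.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(sum(wi * information_content(xi, samples, weights, bandwidth)
                     for wi, xi in zip(w, samples)))

def kl_divergence(p_samples, p_weights, q_samples, q_weights, bandwidth=0.3):
    """KL(p || q) as cross entropy minus entropy, both plug-in estimates."""
    w = np.asarray(p_weights, dtype=float)
    w = w / w.sum()
    cross = float(sum(wi * information_content(xi, q_samples, q_weights, bandwidth)
                      for wi, xi in zip(w, p_samples)))
    return cross - entropy(p_samples, p_weights, bandwidth)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xs = rng.normal(0.0, 1.0, size=500)
    uniform_w = np.ones_like(xs)
    skewed_w = np.exp(xs)                  # reweighting shifts the empirical law
    print(entropy(xs, uniform_w))          # ~ 0.5*log(2*pi*e) ~ 1.42 for N(0,1)
    print(kl_divergence(xs, skewed_w, xs, uniform_w))
```

The example reweights the same sample two ways to show how the weights alone change the estimated quantities, which is the setting the paper addresses.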




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hino, H., Murata, N. (2011). A Computationally Efficient Information Estimator for Weighted Data. In: Honkela, T., Duch, W., Girolami, M., Kaski, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2011. ICANN 2011. Lecture Notes in Computer Science, vol 6792. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21738-8_39


  • DOI: https://doi.org/10.1007/978-3-642-21738-8_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21737-1

  • Online ISBN: 978-3-642-21738-8

  • eBook Packages: Computer Science (R0)
