Abstract
A fundamental issue in computational learning theory, as well as in biological information processing, is the best possible relationship between a model's representation complexity and its prediction accuracy. Clearly, we expect more complex models, which require longer data representations, to be more accurate. Can one give a quantitative, yet general, formulation of this trade-off? In this talk I will discuss this question from the perspective of Shannon's information theory. I will argue that the trade-off can be traced back to the basic duality between source and channel coding, and that it is also related to the notion of “coding with side information”. I will review some of the theoretical achievability results for such relevant data representations and discuss our algorithms for extracting them. I will then demonstrate the application of these ideas to the analysis of natural language corpora and speculate on the possibly universal aspects of human language that they reveal.
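One quantitative formulation of this complexity–accuracy trade-off, presumably the Information Bottleneck framework that this line of work builds on (the notation below, with source variable X, relevance variable Y, compressed representation T, and trade-off parameter β, follows the standard presentation and is an assumption, not something stated in this abstract), is to seek a stochastic map p(t|x) minimizing

\mathcal{L}\big[p(t \mid x)\big] \;=\; I(X;T) \;-\; \beta\, I(T;Y).

Here I(X;T) measures the complexity of the representation (a rate, in source-coding terms) and I(T;Y) measures the predictive information it preserves (the channel-coding side of the duality mentioned above); sweeping β ≥ 0 traces out the curve of achievable complexity–accuracy pairs, in analogy with a rate-distortion function.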
Based on joint work with Ran Bacharach, Gal Chechik, Amir Globerson, Amir Navot, and Noam Slonim.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tishby, N. (2003). Efficient Data Representations That Preserve Information. In: Gavaldá, R., Jantke, K.P., Takimoto, E. (eds) Algorithmic Learning Theory. ALT 2003. Lecture Notes in Computer Science, vol 2842. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39624-6_4
DOI: https://doi.org/10.1007/978-3-540-39624-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20291-2
Online ISBN: 978-3-540-39624-6
eBook Packages: Springer Book Archive