Open access
Date: 2020
Type: Doctoral Thesis
ETH Bibliography: yes
Abstract
This thesis addresses two central tasks in image processing: single-image super-resolution and image compression with generative models.
While super-resolution has traditionally been viewed as a restoration task, it can also be seen as a form of lossy image compression in which the encoder downscales the image and the decoder reconstructs it.
Generative models, typically studied as a tool for unsupervised learning, can also be used (as we will show) for image compression: the decoder synthesizes details that cannot be stored at a given bitrate.
In the first part of this thesis, we study the problem of single-image super-resolution from the perspective of memory efficiency.
We investigate Adjusted Anchored Neighborhood Regression (A+), a clustered regression model over a low-resolution dictionary, and propose Regressor Basis Learning (RB), which restricts the regressor set to a learned low-dimensional subspace.
We show that RB achieves performance comparable to A+ while using orders of magnitude fewer basis regressors, which makes it well suited to memory-constrained applications.
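The memory saving behind RB can be illustrated with a minimal numpy sketch: instead of storing one full regressor matrix per anchor, we store a small shared basis plus per-anchor mixing coefficients. All sizes and names here are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_regressors, n_basis = 1024, 32   # illustrative: many anchors, small basis
d_out, d_in = 27, 28               # illustrative patch/feature dimensions

# shared learned basis of regressors plus per-anchor mixing coefficients
basis = rng.normal(size=(n_basis, d_out, d_in))
coeffs = rng.normal(size=(n_regressors, n_basis))

def regressor(i):
    # each anchored regressor is a linear combination of the shared basis
    return np.tensordot(coeffs[i], basis, axes=1)   # shape (d_out, d_in)

# stored parameters: basis + coefficients vs. all full regressors
mem_rb = basis.size + coeffs.size       # 32*27*28 + 1024*32 = 56,960
mem_full = n_regressors * d_out * d_in  # 1024*27*28 = 774,144
```

With these toy sizes the basis representation stores over 13x fewer parameters, and the gap grows as the number of anchors increases while the basis size stays fixed.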
In the second part of this thesis, we look at super-resolution as a generic regression task.
We propose the Anchored Regression Network (ARN), a generalization of A+ that smooths piecewise linear regression by combining multiple linear regressors through soft assignments to anchor points.
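The soft-assignment idea can be sketched in a few lines of numpy: a softmax over anchor similarities weights the outputs of per-anchor linear regressors. The function name, the softmax form, and the sharpness parameter `alpha` are illustrative assumptions, not the exact ARN formulation.

```python
import numpy as np

def arn_predict(x, anchors, regressors, biases, alpha=10.0):
    # soft assignment: softmax over similarities to the anchor points
    sims = anchors @ x                       # assumes unit-norm anchors and input
    w = np.exp(alpha * (sims - sims.max()))  # numerically stable softmax
    w /= w.sum()
    # smoothed piecewise-linear prediction: weighted sum of linear regressors
    return sum(wi * (A @ x + b) for wi, A, b in zip(w, regressors, biases))
```

As `alpha` grows, the soft assignment sharpens toward a hard nearest-anchor choice, recovering an A+-style piecewise linear predictor.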
We demonstrate the power of the ARN by applying it to two very different and challenging tasks, age prediction from face images and image super-resolution, and obtain strong results in both cases.
In the third part of this thesis, we present a learned image compression system based on Generative Adversarial Networks (GANs), operating at extremely low bitrates.
Our proposed model synthesizes details it cannot afford to store, obtaining visually pleasing results at bitrates where previous methods fail and exhibit strong artifacts.
A user study confirms that for low bitrates, our approach is preferred to state-of-the-art methods, even when they use more than double the bits.
In the fourth and final part of this thesis, we study how to inspect the latent space of generative models such as GANs and Variational Autoencoders (VAEs), which are trained with a fixed prior distribution.
We show that popular latent space operations (such as interpolating between two samples) can induce a distribution mismatch between the resulting outputs and the prior distribution.
To address this, we propose distribution-matching transport maps that ensure such latent space operations preserve the prior distribution, and we experimentally validate that the proposed operations yield higher-quality samples than the original operations.
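The mismatch induced by linear interpolation is easy to verify numerically for a standard Gaussian prior: the midpoint of two prior samples is distributed as N(0, 0.5 I), so its norm concentrates around sqrt(d/2) rather than the prior's sqrt(d). The norm-rescaling correction below is only a crude stand-in for the transport maps proposed in the thesis; the dimensionality is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512                                  # illustrative latent dimensionality
z0, z1 = rng.standard_normal(d), rng.standard_normal(d)

# linear midpoint: distributed as N(0, 0.5*I), so its norm concentrates
# around sqrt(d/2) instead of the prior's sqrt(d) -- a distribution mismatch
mid = 0.5 * (z0 + z1)

# a simple norm-matching correction (a crude stand-in for the
# distribution-matching transport maps proposed in the thesis)
mid_matched = mid * np.sqrt(d) / np.linalg.norm(mid)
```

In high dimensions almost all prior mass lies near the sphere of radius sqrt(d), so samples decoded from the unmatched midpoint come from a region the model rarely saw during training.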
Permanent link: https://doi.org/10.3929/ethz-b-000487536
Publication status: published
Publisher: ETH Zurich
Subject: Computer vision; Super-resolution; Compression; Generative models
Organisational unit: 03514 - Van Gool, Luc