Automatic production of quantisation matrices based on perceptual modelling of wavelet coefficients for grey scale images

https://doi.org/10.1016/j.imavis.2009.10.007

Abstract

Wavelet domain statistical models have been shown to be useful for certain applications, e.g. image compression, watermarking and Gaussian noise reduction. One of the main problems for wavelet-based compression is to overcome quantisation error efficiently. Inspired by Weber–Fechner's Law, we introduce a logarithmic model that approximates the non-linearity of human perception and partially precompensates for the effect of the display device. A logarithmic transfer function is proposed in order to spread the distribution of the coefficients in the wavelet domain in compliance with human perceptual attributes. The standard deviation σ of the logarithmically-scaled coefficients in a subband represents the average difference from the mean of the coefficients in that subband, and is chosen as a measure of the visibility threshold within this subband. Computing the values of σ for all subbands results in a quantisation matrix for a chosen image. The quantisation matrix is then scaled by a factor ρ in order to provide the best trade-off between the visual quality and the bit-rate of the processed image. A major advantage of this model is that it allows for observing the visibility threshold and automatically produces a quantisation matrix that is content-dependent and scalable, without further interaction from the user. Experimental results show that the model works for any wavelet basis.

Introduction

Wavelet algorithms have been widely studied for certain image processing applications. Image compression and watermarking are the most common among these applications [33], [14]. Wavelets are considered a powerful tool due to their ability to perform multiresolution decomposition [19], which allows for identifying and separating the more significant and the less significant coefficients. Wavelets can also decorrelate the coherent portions of the image for the purpose of reducing the number of significant coefficients required to represent each portion locally [3].

Wavelet coefficients exhibit a certain structure or behaviour that can be modelled to extract the relevant information. The feature extraction depends on the requirements of the application at hand. However, modelling wavelet coefficients is a complex task because it must account for several factors, e.g. the nature of the image, human visual perception and the response of the visualisation device.

Feature extraction that satisfies the human perception and display factors requires choosing an appropriate domain in which to process the image information. A non-linear representation of images is required to take the human visual perception and/or the characteristics of the display device into consideration. In addition, when a non-linear domain is chosen, the choice of transfer function is essential. The γ-correction function and the logarithmic scaling function are the most popular transfer functions in the image processing field. Non-linear representation of images is also widely used within the computer graphics community to code brightness and colour [26]. The model presented in this paper is motivated by Weber–Fechner's Law, which describes the differential sensitivity of human perception using a logarithmic transfer function to represent the perceived intensity of a given source.
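In its standard textbook form (the paper's exact formulation is not reproduced in this excerpt), the law states that the perceived magnitude p grows logarithmically with the stimulus intensity I above a detection threshold I₀, for a modality-dependent constant k:

p = k·ln(I/I₀)

Equal ratios of intensity thus map to equal increments of perceived magnitude, which is the property a logarithmic transfer function exploits.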

The work presented in this paper describes a wavelet-based statistical model. The proposed model provides an automatic production of a quantisation matrix that serves image compression applications. The model exploits the differential sensitivity of the human visual system to 8-bit grey scale images to produce a quantisation matrix. The derived quantisation matrix produces errors below the visibility threshold of the human eye.

The image compression mechanism proposed by the Joint Photographic Experts Group (JPEG) [2] is today's lossy still-image compression standard and is used for natural images. It combines a block-based implementation of the Discrete Cosine Transform (DCT) with quantisation and then Huffman coding. Although these methods are efficient, block noise (artifacts) appears in the resulting image when a lower average bit-rate is employed [9], [13]. Nevertheless, none of the existing techniques provides an automatic way to calculate a wavelet-dependent and image-content-adaptive quantisation matrix as we are proposing.
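As a point of comparison, the following is a minimal sketch of that block-based pipeline (level shift, 8×8 DCT, quantisation against the standard JPEG luminance table, reconstruction); the Huffman entropy-coding stage is omitted and the code is illustrative rather than the standard's normative procedure:

```python
import numpy as np
from scipy.fftpack import dct, idct

# Standard JPEG luminance quantisation table (quality 50, Annex K).
Q50 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def dct2(block):
    """2-D type-II DCT with orthonormal scaling."""
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(block):
    """2-D inverse DCT with orthonormal scaling."""
    return idct(idct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def jpeg_like_block(block):
    """Quantise one 8x8 block: level shift, DCT, divide by Q, round, rebuild."""
    coeffs = dct2(block.astype(float) - 128.0)  # level shift as in JPEG
    quantised = np.round(coeffs / Q50)          # the lossy step
    return idct2(quantised * Q50) + 128.0       # dequantise and invert
```

The coarse rounding of high-frequency coefficients, applied independently per block, is what produces the block artifacts mentioned above at low bit-rates.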

The organization of the paper is as follows: In Section 2, an overview of modelling wavelet coefficients, together with the theoretical justification for processing the wavelet coefficients in the non-linear domain, is given. The proposed model and the subband-based feature extraction are detailed in Sections 3 and 4. Results on tests of the effectiveness of the subband-based features are reported in Section 5. Conclusions and future research directions can be found in Section 6.

Section snippets

Theoretical background

Due to the multi-channel nature of the human visual system (HVS), researchers in the field have paid attention to wavelet multi-channel features, because they support decomposing the image into spatial-frequency and orientation components. Hence, this representation facilitates the integration of HVS properties into the quantisation stage. The invisibility of quantisation errors in the reconstructed image necessitates a good representation of the image contents and appropriately chosen

Logarithmic modelling of wavelet coefficients

Motivated by the non-linearity of human visual perception, the non-linearity γ of display devices, and the non-Laplacian behaviour of the wavelet coefficients, the transform function derived in the previous section (Eq. (7)) is proposed in order to transform the image coefficients in the wavelet domain. This function is proposed because it: (a) approximates the human visual perception, (b) takes the display's γ into consideration, and (c) provides a uniform distribution of the wavelet
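Eq. (7) itself is not reproduced in this excerpt. As a minimal sketch of the idea, a sign-preserving logarithmic scaling of the coefficients could look like the following; the log1p form (an offset of 1 to avoid the singularity at zero) is an assumption for illustration, not the paper's exact function:

```python
import numpy as np

def log_scale(coeffs):
    """Sign-preserving logarithmic scaling of wavelet coefficients.

    Illustrative stand-in for the paper's Eq. (7): compresses large
    magnitudes and spreads the peaked, heavy-tailed coefficient
    distribution towards a more uniform one.
    """
    return np.sign(coeffs) * np.log1p(np.abs(coeffs))

def log_unscale(scaled):
    """Exact inverse of log_scale, applied after dequantisation."""
    return np.sign(scaled) * np.expm1(np.abs(scaled))
```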

Producing image content dependent quantisation matrices

From this model, a quantisation matrix can be automatically computed by defining a visibility threshold, producing quantisation errors just below the visibility threshold of the human eye. The procedure to compute this matrix is independent of the wavelet basis functions, but it produces image-dependent quantisation matrices (Qlog) for a specific wavelet. In one sense, one can also say that this leads to basis-dependent quantisation matrices. The idea is to achieve better perceptual
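A minimal sketch of this procedure as described in the abstract, assuming PyWavelets for the decomposition and the illustrative log scaling above in place of Eq. (7): decompose the image, log-scale each subband, take the standard deviation σ of the scaled coefficients as that subband's entry in Qlog, and scale the result by ρ. The function and parameter names (quantisation_matrix, 'db4', levels=3) are assumptions for illustration:

```python
import numpy as np
import pywt

def log_scale(coeffs):
    """Sign-preserving log scaling (illustrative stand-in for Eq. (7))."""
    return np.sign(coeffs) * np.log1p(np.abs(coeffs))

def quantisation_matrix(image, wavelet='db4', levels=3, rho=1.0):
    """Produce an image- and wavelet-dependent quantisation matrix Qlog.

    One sigma (the visibility-threshold estimate) per subband, scaled
    by the quality/bit-rate trade-off factor rho. Detail tuples from
    pywt.wavedec2 are ordered coarsest level first.
    """
    coeffs = pywt.wavedec2(image.astype(float), wavelet, level=levels)
    q = {'approx': rho * np.std(log_scale(coeffs[0]))}
    for level, (cH, cV, cD) in enumerate(coeffs[1:], start=1):
        for name, band in (('H', cH), ('V', cV), ('D', cD)):
            q[f'level{level}_{name}'] = rho * np.std(log_scale(band))
    return q
```

Each subband's coefficients would then be quantised with a step derived from the corresponding σ; the snippet only produces the matrix itself.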

Experimentation results

Objective methods for assessing perceptual image quality attempt to quantify the visibility of differences between a distorted image and a reference. The most widely used full-reference quality metrics are the mean squared error (MSE), computed by averaging the squared intensity differences between distorted and reference image pixels, and the related peak signal-to-noise ratio (PSNR). Due to their simplicity and clear interpretation, these objective quality metrics are
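For reference, the two metrics exactly as defined above, for 8-bit images (peak value 255):

```python
import numpy as np

def mse(reference, distorted):
    """Mean squared error between reference and distorted images."""
    diff = reference.astype(float) - distorted.astype(float)
    return np.mean(diff ** 2)

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, distorted)
    return float('inf') if err == 0 else 10.0 * np.log10(peak ** 2 / err)
```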

Conclusions

In this paper we presented a statistical model of the wavelet coefficients that provides an automatic way to generate scalable, image-dependent quantisation matrices for wavelet-based compression. The proposed log-based model generates quantisation matrices for grey scale images and produces quantisation errors below the visibility threshold.

The proposed logarithmic transform function of the wavelet coefficients is based on Weber–Fechner's Law on human perception. The proposed

References (33)

  • M.G. Albanesi et al.

    An HVS-based adaptive coder for perceptually lossy image compression

    Pattern Recognition

    (2002)
  • I. Andreopoulos, Y.A. Karayiannis, T. Stouraitis, A hybrid image compression algorithm based on fractal coding and...
  • G. Beylkin, A. Vassiliou, Wavelet Transforms and Compression of Seismic Data, Mathematical Geophysics Summer School,...
  • C. Bloch, U. Steinmüller, D. Bethke, B. Vogl, The HDRI Handbook: High Dynamic Range Imaging for Photographers and CG...
  • R.W. Buccigrossi et al.

    Image compression via joint statistical characterization in the wavelet domain

    IEEE Transactions on Image Processing

    (1999)
  • R. Calderbank et al.

    Lossless image compression using integer to integer wavelet transforms

  • D.M. Chandler, S.S. Hemami, Contrast-based quantization and rate control for wavelet-coded images, in: Proceedings on...
  • H.A. Chipman et al.

    Adaptive Bayesian wavelet shrinkage

    Journal of the American Statistical Association

    (1997)
  • A. Jakulin, Baseline JPEG and JPEG2000 Artifacts Illustrated, Visicron,...
  • D. Jameson et al.

    Handbook of Sensory Physiology

    (1972)
  • E. Jordon, A power law for contrast discrimination, Vision Research 21...
  • A.J. Ahumada Jr., H.A. Peterson, Luminance-model-based DCT quantization for color image compression, in: B.E. Rogowitz...
  • H. Kondo, Y. Oishi, Digital image compression using directional sub-block DCT, in: International Conference on...
  • E. Lam

    Statistical modelling of the wavelet coefficients with different bases and decomposition levels

    IEE Proceedings – Vision, Image and Signal Processing

    (2004)
  • J. Li et al.

    Context-based multiscale classification of document images using wavelet coefficient distributions

    IEEE Transactions on Image Processing

    (2000)
  • Z. Liu et al.

    JPEG2000 encoding with perceptual distortion control

    IEEE Transactions on Image Processing

    (2006)