Linking visual saliency deviation to image quality degradation: A saliency deviation-based image quality index

https://doi.org/10.1016/j.image.2019.04.007

Highlights

  • Visual quality degradation is linked to and modelled with saliency deviation for the first time.

  • The proposed metric avoids ad-hoc saliency weighting and corresponding computational cost.

  • Both saliency and quality information of the test stimuli are provided.

Abstract

Advances in image quality research have shown the benefits of modeling functional components of the human visual system in image quality metrics. Recently, visual saliency, an important aspect of the human visual system, has been increasingly investigated in relation to visual quality perception. Existing studies have shown that incorporating visual saliency improves the performance of image quality metrics. However, current applications of visual saliency in image quality metrics mainly focus on extending a specific metric with a specific visual saliency model, and questions about the optimal use of visual saliency in image quality metrics remain. Psychophysical experiments reported in the literature have revealed that visual artifacts occurring in an image can change fixation deployment relative to that of the undistorted image. As such, instead of using saliency models as add-ons to image quality metrics, we explored the approach of directly assessing image quality by measuring the visual saliency deviation triggered by visual artifacts. We first analyzed the relationship between visual saliency deviation and image quality degradation on the basis of a large-scale eye-tracking dataset. A saliency deviation-based image quality index was then devised. Experimental results show that the proposed metric features high prediction accuracy at relatively low computational cost.

Introduction

In current image processing pipelines, image signals are subject to visual distortions at every stage, including data acquisition, compression, transmission and reproduction [1]. Perceived visual artifacts in images can degrade the visual experience of end users and may even cause interpretation errors in visual inspection tasks. Computational image quality metrics (IQMs) have therefore emerged as an important component of modern imaging systems for the automatic assessment of perceived image quality [2]. Nowadays, IQMs designed for different purposes are widely available for a broad range of applications, e.g., fine-tuning image and video processing pipelines, evaluating image and video enhancement algorithms, and quality monitoring and control of displays. Depending on the extent to which they utilize the undistorted reference, IQMs can be classified into full-reference (FR), reduced-reference (RR) and no-reference (NR) metrics. In this paper, we focus on devising a full-reference metric, for which the distortion-free image is required.

As a traditional pixel-wise fidelity metric, peak signal-to-noise ratio (PSNR) measures quality directly as the intensity change of image signals and does not correlate well with human quality perception. Over the past decades, substantial progress has been made in exploring and modeling functional aspects of the human visual system (HVS). Advances in human vision research have allowed HVS features to be integrated into the design of so-called perception-driven metrics [3], [4], [5], [6], [7]. PSNR-HA and PSNR-HMA [3] take into account the mean-shift sensitivity and contrast sensitivity of the HVS when calculating the PSNR. The Visual Signal-to-Noise Ratio (VSNR) [4] weights the signal-to-noise ratio (SNR) with a wavelet-based model of visual masking. CSV [5] models image quality using the contrast sensitivity of retinal cells along with the differences in color and structural information between the original and distorted images. The Noise Quality Measure (NQM) [6] considers contrast sensitivity, local luminance, contrast interaction between spatial frequencies and the contrast masking effects of the HVS. Most Apparent Distortion (MAD) [7] models the local luminance and contrast masking features of the HVS and estimates image quality using adaptive strategies depending on the strength of the distortions.
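As a point of reference for the perception-driven metrics above, the PSNR baseline can be sketched in a few lines (a minimal implementation for illustration; `peak` is the maximum signal value, 255 for 8-bit images):

```python
import numpy as np

def psnr(ref, dist, peak=255.0):
    """Peak signal-to-noise ratio: quality as pixel-wise intensity change.

    ref, dist : arrays of equal shape holding the reference and distorted images.
    Returns the PSNR in decibels (infinite for identical images).
    """
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

Because PSNR treats every pixel difference identically, two images with the same mean squared error receive the same score regardless of where or how visibly the distortion appears, which is precisely the shortcoming the HVS-based metrics address.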

Instead of simulating specific features of the HVS, another class of IQMs, namely signal-driven IQMs, concerns the overall functionality of the HVS and focuses on the analysis of image statistics and distortions [8], [9], [10], [11], [12]. Exploiting the observation that the HVS is highly adapted to extracting structural information from visual scenes, the Structural SIMilarity (SSIM) index [8] quantifies image quality using luminance, contrast and structure features. It was further extended to an information-weighted version (IW-SSIM) [9] with an information-content pooling strategy. Similarly, based on the observation that the visual system focuses on structures and segments, the Perceptual SIMilarity (PerSIM) metric [10] estimates image quality based on Laplacian of Gaussian features and chroma similarity. The Feature SIMilarity (FSIM) index [11] utilizes phase congruency and gradient magnitude to calculate local distortions. The Visual Information Fidelity (VIF) metric [12] estimates image quality by quantifying how much of the information present in the reference image can be extracted from the distorted image.
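The luminance, contrast and structure comparison at the heart of SSIM can be illustrated with a single-window sketch. Note that the published metric [8] computes these terms over local sliding windows and averages them; this whole-image variant is for illustration only, with the standard stabilizing constants:

```python
import numpy as np

def ssim_global(x, y, peak=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM: luminance term times a combined
    contrast/structure term, with constants c1, c2 for stability.
    The full SSIM averages this over local sliding windows.
    """
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    luminance = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    contrast_structure = (2 * cov + c2) / (vx + vy + c2)
    return luminance * contrast_structure
```

Identical images yield a score of 1; the score decreases as mean, variance or covariance diverge, which is why SSIM tracks structural degradation rather than raw intensity error.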

In recent years, designing learning-based IQMs has gained popularity with the development of artificial intelligence. The basic strategy is to learn a regression model that maps image features to a quality score. Various regression methods, including support vector regression (SVR) [13], neural networks [14], [15], random forest regression [16] and deep learning frameworks [17], [18], [19], are widely used for model learning. In [13], inspired by the free-energy brain theory, structure-related features and HVS-inspired features are extracted, and a quality estimator is then derived using a support vector regression module. In [14], the authors proposed a compact multi-task convolutional neural network to simultaneously estimate image quality and identify the distortion type in an image. In [15], researchers developed an IQM based on phase, entropy and gradient features fed to a regression neural network. In [16], the authors represented an image by a feature vector and adopted a random forest to train the regression model that maps the feature vector to a subjective score. In [17], a deep learning network was designed to classify an image into five quality grades and then convert these qualitative labels into numerical scores. In [18], the authors developed convolutional networks for image quality assessment combining feature learning and regression in a single optimization process. In [20], an Unsupervised Image Quality Estimation (UNIQUE) metric was proposed based on comparing the monotonicity of sparse representations learned from generic image databases.

Recent research [21] has shown that current visual quality modeling lacks the sophistication needed to deal with real-world complexity, making image quality assessment an ongoing research topic. Improving the reliability of IQMs therefore rests on a deeper understanding of the HVS and on modeling those of its aspects that are relevant to visual quality perception. To this end, one of the growing trends in image quality research is to investigate how visual attention affects image quality judgement. Visual attention is a mechanism of the HVS that drives the selection of visual information in a visual scene [22]. The bottom-up, stimulus-driven part of the attentional mechanism is often referred to as visual saliency in the computer vision community [23]. A computational saliency model generates a topographic map representing the conspicuousness of scene locations [24].

The underlying rationale for incorporating saliency information in IQMs is that visual artifacts occurring in salient regions are considered to have a higher impact on perceived image quality than those in non-salient regions [25]. As such, research in the literature has mainly focused on extending a specific metric with a specific visual saliency model. The most common method is to weight local distortions with local saliency [26], [27], [28], [29], resulting in a so-called saliency-weighted metric. For example, Moorthy et al. [26] integrated an existing saliency model called GAFFE [30] into the SSIM metric, achieving an improvement of 1% to 4% in metric performance. In [27], an NR metric for evaluating JPEG2000 compression artifacts was designed using the saliency model proposed in [31]; experimental results demonstrated that the saliency information yielded significant improvements over the same IQM without saliency. It should be noted that this saliency-weighting approach implicitly assumes that the attentional mechanism of the HVS functions as a post-processing step when assessing image quality; the complicated interaction between visual saliency and image quality is neglected. The added value of visual saliency in IQMs strongly depends on the saliency model, the IQM and the characteristics of the test image [32]. Therefore, the obtained added value is often limited and can even be negative [33]. Moreover, the saliency-weighting approach incurs extra computational cost for generating saliency maps and refining the importance of local distortions, which may further limit the deployment of saliency-weighted metrics in real-time applications. Exploring perceptually optimized ways to utilize saliency information in IQMs is therefore worth further investigation.
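The saliency-weighting scheme described above amounts to pooling a local distortion (or quality) map using saliency values as weights. A minimal sketch of this pooling step follows; the function and argument names are ours, not taken from any of the cited metrics:

```python
import numpy as np

def saliency_weighted_pool(local_quality, saliency, eps=1e-8):
    """Pool a per-pixel quality map into one score, weighting each
    location by its (normalized) saliency. With uniform saliency this
    reduces to the plain mean, i.e. the unweighted metric.
    """
    weights = saliency / (saliency.sum() + eps)
    return float((weights * local_quality).sum())
```

This illustrates both points made in the text: the weighting is applied after the distortion map is already computed (a post-processing assumption), and it requires an extra saliency map to be generated for every test image.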

In this paper, rather than using saliency as an add-on to extend existing IQMs, we explored the approach of modeling image quality by quantifying the saliency deviation induced by visual artifacts. We hypothesized that the saliency deviation caused by visual artifacts may serve as a proxy for the visual quality degradation caused by the same artifacts. To validate this hypothesis, we first investigated the relationship between saliency deviation and quality degradation on the basis of a large-scale eye-tracking database. We then considered how the findings could be used to construct an IQM, resulting in a saliency deviation-based IQM that avoids the disadvantages of the traditional saliency-weighting approach.

Section snippets

Related work

Eye-tracking experiments have been conducted to understand visual saliency in relation to image quality assessment [34], [35], [36], [37]. For example, an eye-tracking study was conducted in [34] to investigate how the saliency fixations of undistorted images may be affected by visual distortions. Based on inspection of the eye-tracking data, the authors concluded that compression artifacts can affect saliency deployment, whilst white noise and blurring were not observed to impact the fixation

Saliency deviation analysis

Our eye-tracking study revealed preliminary findings regarding the relationship between saliency deviation and quality degradation. However, these findings are still insufficient to conclude that image quality can be directly modeled by saliency deviation. A more in-depth analysis was therefore conducted in this section to further clarify the interaction between visual saliency and perceived quality.

Saliency deviation index

In this section, we measure the saliency deviation between the reference image and its distorted version in three respects: global saliency deviation, local saliency deviation and chrominance-induced saliency deviation. These three aspects correspond to the fixation-deployment changes caused by strong pop-out artifacts, uniformly distributed visual artifacts and chromatic-channel artifacts, respectively. The saliency dispersion factor, as discussed in Section 3
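As an illustration only — this is not the paper's formulation, which is developed with the three components and the dispersion factor described in this section — a global deviation between two saliency maps could be quantified as, for example, the Kullback–Leibler divergence between the maps after normalizing each into a probability distribution:

```python
import numpy as np

def global_saliency_deviation(sal_ref, sal_dist, eps=1e-8):
    """Illustrative global saliency deviation: KL divergence between the
    reference and distorted saliency maps, each normalized to sum to 1.
    Returns ~0 for identical maps; grows as fixation deployment shifts.
    NOTE: a stand-in for the paper's index, not its actual definition.
    """
    p = sal_ref / (sal_ref.sum() + eps)
    q = sal_dist / (sal_dist.sum() + eps)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))
```

The key property such a measure must have, per the hypothesis above, is that its value increases monotonically as artifacts pull fixations away from the original saliency deployment.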

Experimental results and discussions

The performance of SDI was evaluated on three widely used image quality assessment databases: LIVE [40], CSIQ [53] and TID2013 [48]. We also compared its performance with that of 12 state-of-the-art IQMs in terms of prediction accuracy and complexity. The IQMs used for comparison are SSIM [8], IW-SSIM [54], VIF [12], VSNR [4], MAD [7], FSIMc [11], GSM [55], CSV [5], UNIQUE [20], PerSIM [10], PSNR-HA [3] and PSNR-HMA [3]. The performance evaluation criteria used are SROCC,
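SROCC, used as an evaluation criterion here, is the Pearson correlation computed on the rank-transformed metric scores and subjective scores; it measures prediction monotonicity independently of any fitting function. A minimal sketch, assuming no tied values:

```python
import numpy as np

def srocc(x, y):
    """Spearman rank-order correlation coefficient (no tie handling):
    Pearson correlation of the ranks of x and y.
    """
    # argsort of argsort yields each element's rank (0..n-1) when values are distinct
    rx = np.argsort(np.argsort(x)).astype(np.float64)
    ry = np.argsort(np.argsort(y)).astype(np.float64)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx * ry).sum() / np.sqrt((rx ** 2).sum() * (ry ** 2).sum()))
```

Any strictly monotone relationship between metric outputs and subjective scores yields an SROCC of 1, which is why it is preferred over plain linear correlation when metric scales are nonlinear.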

Conclusion

In this paper, we explored the relationship between visual saliency deviation and visual quality degradation. We found that the visual saliency deviation triggered by visual artifacts correlates well with the image quality degradation caused by the same distortions. Our psychophysical findings provide a better understanding of the role visual saliency plays in image quality assessment. Moreover, the empirical evidence revealed in our study provides a new approach to developing

Acknowledgments

This work was supported by the Fundamental Research Funds for the Central Universities under Grant JB180105 and the National Natural Science Foundation of China under Grant 61801364.

References (59)

  • Chandler, D.M., et al., VSNR: a wavelet-based visual signal-to-noise ratio for natural images, IEEE Trans. Image Process. (2007)

  • Damera-Venkata, N., et al., Image quality assessment based on a degradation model, IEEE Trans. Image Process. (2000)

  • Larson, E.C., et al., Most apparent distortion: full-reference image quality assessment and the role of strategy, J. Electron. Imaging (2010)

  • Wang, Z., et al., Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process. (2004)

  • Wang, Z., et al., Information content weighting for perceptual image quality assessment, IEEE Trans. Image Process. (2011)

  • Temel, D., et al., PerSIM: multi-resolution image quality assessment in the perceptually uniform color domain

  • Zhang, L., et al., FSIM: a feature similarity index for image quality assessment, IEEE Trans. Image Process. (2011)

  • Sheikh, H.R., et al., Image information and visual quality, IEEE Trans. Image Process. (2006)

  • Gu, K., et al., Using free energy principle for blind image quality assessment, IEEE Trans. Multimed. (2015)

  • Kang, L., et al., Simultaneous estimation of image quality and distortion via multi-task convolutional neural networks

  • Li, C., et al., Blind image quality assessment using a general regression neural network, IEEE Trans. Neural Netw. (2011)

  • Zhang, L., et al., Training quality-aware filters for no-reference image quality assessment, IEEE MultiMedia (2014)

  • Hou, W., et al., Blind image quality assessment via deep learning, IEEE Trans. Neural Netw. Learn. Syst. (2015)

  • Kang, L., et al., Convolutional neural networks for no-reference image quality assessment

  • Bosse, S., et al., Deep neural networks for no-reference and full-reference image quality assessment, IEEE Trans. Image Process. (2018)

  • Temel, D., et al., UNIQUE: unsupervised image quality estimation, IEEE Signal Process. Lett. (2016)

  • Ghadiyaram, D., et al., Massive online crowdsourced study of subjective and objective picture quality, IEEE Trans. Image Process. (2016)

  • Borji, A., et al., Quantitative analysis of human-model agreement in visual saliency modeling: a comparative study, IEEE Trans. Image Process. (2013)

  • Itti, L., et al., A model of saliency-based visual attention for rapid scene analysis, IEEE Trans. Pattern Anal. Mach. Intell. (1998)

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have an impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.image.2019.04.007.
