Saliency-induced reduced-reference quality index for natural scene and screen content images
Introduction
The rapid advancement of transmission technologies has boosted various remote applications such as telecommuting and cloud computing, which produce massive amounts of computer-generated content known as "screen content". Screen content has distinctive characteristics that differ from natural scenes because of the computer-generated elements it contains, e.g., text, icons, tables, and graphics. These characteristics sometimes violate natural scene statistics (NSS) and cause failures in traditional applications designed for natural scene images (NSIs). Hence, specialized technologies for screen content images (SCIs) have been proposed, such as screen content video compression [1].
The boom in screen content also calls for SCI-specific image quality measures, yet limited work has been done on SCI quality assessment (QA). In [2], the authors constructed a screen image quality assessment database (SIQAD), which shows that state-of-the-art image quality assessment (IQA) measures do not work well for SCIs. This is unsurprising, since current IQA measures are implicitly designed for NSIs and rely to some extent on NSS. Wang et al. [3] also constructed a database for the quality assessment of compressed SCIs (QACS). In [4], the authors proposed a full-reference (FR) saliency-guided quality measure for SCIs named SQMS, which uses gradient magnitude similarity as the quality map and then weights it with a specific saliency map. Gu et al. [5], [6] learned blind quality evaluation engines for SCIs from a large set of SCIs and the corresponding objective quality scores calculated by FR measures.
Although dozens of NSI quality estimators [7], [8], [9], [10], [11], [12], [13], [14], [15], [16] and several SCI quality measures [2], [3], [4], [5], [6] have been proposed, they are either implicitly designed for NSIs or specifically developed for SCIs. Only a few quality measures work for NSIs and SCIs simultaneously. Min et al. [17] proposed a blind blockiness measure that works for JPEG-compressed NSIs and SCIs. Xu et al. [18] developed a measure for NSIs and SCIs. In [19], Min et al. constructed a cross-content-type database and proposed a unified content-type adaptive blind IQA measure for compressed natural, graphic and screen content images. In practical multimedia communication systems, we may encounter both types of images, sometimes without any prior knowledge of the image type; efficient general quality measures that ignore image types are highly needed in such circumstances. In this paper, we extract quality features that are effective for both types of images and develop a general reduced-reference (RR) quality measure without any explicit image type classification.
The proposed method is based on visual saliency detection, an important research topic in psychology, image processing and computer vision [20]. Visual attention and quality assessment are two closely related research topics [7], [9], [10], [12], [13], [21], [22], [23]. Quality degradation can influence visual attention [21]. Conversely, visually salient positions should be processed more carefully, since subjects judge image quality from observations of a limited set of positions; a typical use of visual attention models is to optimize resource allocation and improve perceptual quality under bandwidth constraints [24], [25], [26], [27], [28].
Motivated by the interaction between visual attention and quality assessment, some researchers have used visual attention maps as weighting maps during the quality pooling stage of IQA [9], [10], [13], [22], [23]. Min et al. [13] collected visual attention data for mainstream IQA databases. Zhang et al. [22], [23] studied the use of saliency models in objective quality assessment. Liu et al. [9] used the saliency map to highlight visually salient areas. Beyond highlighting salient regions, Saha and Wu [10] used the dissimilarity between the saliency maps of the reference and distorted images to emphasize the more distorted image content. Besides visual attention maps, some measures utilize other kinds of weighting maps, such as the phase congruency map [7] and the gradient magnitude map [12]. Although these involve no explicit visual attention prediction or saliency detection process, such weighting maps also highlight visually salient positions and can thus be regarded as a form of visual saliency.
Beyond serving as a weighting map, visual saliency can also be used as a quality feature, since quality degradation strongly affects saliency detection. Zhang et al. [8] proposed a FR IQA method named VSI that measures the similarity between the reference and distorted images' visual saliency. VSI is a FR measure since it utilizes not only saliency but also gradient magnitude and chrominance, and all extracted feature maps have the same resolution as the reference image. In fact, deriving a gray-scale saliency map from a color image is a dimension-reduction operation, which motivates us to develop a saliency-induced reduced-reference (SIRR) IQA measure.
SIRR computes the saliency map of the reference image as the reference data, and then measures the similarity between the reference and distorted images' saliency maps. We reduce the reference data in two ways. First, we down-sample the reference image to a coarser scale before detecting saliency, so that its resolution is only 1/64 of the original; we take full advantage of this down-sampling to reduce the reference data. Second, we exploit a binary image descriptor called the "image signature" [29] to detect image saliency. Representing saliency by the binary image signature further reduces the reference data significantly. The final quality score describes the similarity between the two images' saliency maps, which in this work is evaluated by the classical image fidelity measure SSIM [30].
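The pipeline just described can be sketched in a few lines. The following is a minimal illustration rather than the paper's exact implementation: the down-sampling factor, the smoothing parameter, and the single-window SSIM (the original SSIM [30] uses a local sliding window) are all simplifying assumptions made for brevity.

```python
import numpy as np
from scipy.fft import dctn, idctn
from scipy.ndimage import gaussian_filter, zoom


def signature_saliency(img, down=8, sigma=2.0):
    """Image-signature saliency on a coarsely down-sampled gray image.

    `down` and `sigma` are illustrative parameters, not the paper's settings.
    Returns the saliency map and the binary signature (signs of the DCT).
    """
    small = zoom(img.astype(np.float64), 1.0 / down)  # 1/64 of the pixels
    signature = np.sign(dctn(small, norm="ortho"))    # binary descriptor [29]
    recon = idctn(signature, norm="ortho")            # reconstruct from signs
    return gaussian_filter(recon ** 2, sigma), signature


def global_ssim(x, y, c1=1e-4, c2=9e-4):
    """Single-window SSIM between two maps (a simplification of [30])."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))


# Quality score: similarity between reference and distorted saliency maps.
rng = np.random.default_rng(0)
ref = rng.random((256, 256))                                 # stand-in image
dist = np.clip(ref + 0.3 * rng.standard_normal(ref.shape), 0, 1)
s_ref, _ = signature_saliency(ref)
s_dist, _ = signature_saliency(dist)
score = global_ssim(s_ref, s_dist)  # closer to 1 means better quality
```

Note how little reference data this scheme needs: for an H × W image, the binary signature of the 1/8-scale version occupies only HW/64 bits, compared with the full-resolution pixels an FR measure would require.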
We perform extensive experiments to test the proposed SIRR on both NSIs and SCIs, using five large-scale NSI QA databases and two recent SCI QA databases. Among the NSI databases, LIVE [31], TID2008 [32] and CSIQ [33] are general-purpose IQA databases, whereas LIVEMD [34] focuses on multiply distorted images and CID2013 [35] consists of contrast-changed images. As for the SCI databases, SIQAD [2] is general-purpose and QACS [3] concentrates on compressed SCIs. Together, the seven databases provide an overall description of both NSIs and SCIs. As will be shown in the experimental section, the proposed SIRR is efficient for both types of images: it is comparable to, or outperforms, state-of-the-art FR and RR IQA measures on all seven IQA databases.
The remainder of this paper is organized as follows. Section 2 describes the proposed saliency-induced reduced-reference quality measure. Experimental results, including comparisons with state-of-the-art FR and RR quality measures, are given in Section 3. Section 4 concludes this paper.
Saliency-induced reduced-reference quality measure
As described in Section 1, visual saliency has been widely used in IQA, but generally as a weighting map during the final pooling; little work has considered saliency as a quality feature. Most bottom-up saliency models rely heavily on low-level features, which are sensitive to quality degradation. Fig. 1 illustrates the influence of quality degradation on image saliency. From this figure, we can observe that perceptible quality degradation causes perceptible changes in image saliency.
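The sensitivity illustrated by Fig. 1 is easy to verify numerically. The snippet below is a small illustrative check, not the paper's experiment: a synthetic random image, an arbitrary noise level, and an arbitrary down-sampling factor are assumed. It counts how many bits of the binary image signature flip when the image is distorted.

```python
import numpy as np
from scipy.fft import dctn
from scipy.ndimage import zoom

rng = np.random.default_rng(1)
ref = rng.random((128, 128))                       # synthetic "reference" image
dist = ref + 0.5 * rng.standard_normal(ref.shape)  # heavy additive noise

# Binary image signature [29]: signs of the DCT of a coarse version.
sig_ref = np.sign(dctn(zoom(ref, 1 / 8), norm="ortho"))
sig_dst = np.sign(dctn(zoom(dist, 1 / 8), norm="ortho"))

flip_rate = float(np.mean(sig_ref != sig_dst))     # fraction of flipped bits
```

A nonzero flip rate confirms that the low-level descriptor underlying the saliency model reacts to quality degradation, which is exactly the property SIRR exploits as a quality feature.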
Validation of SIRR
The proposed SIRR measure is validated on both NSIs and SCIs. The details are as follows.
Conclusion
In this paper, we propose a saliency-induced reduced-reference (SIRR) IQA measure for the two most common but quite different types of images encountered in realistic multimedia communication systems, i.e., NSIs and SCIs. We develop SIRR based on the observations that quality degradation can significantly affect saliency detection, and that saliency detection is in fact an operation of dimension and data reduction. SIRR evaluates quality by measuring the similarity between the two images' saliency maps.
Acknowledgments
This work was supported in part by National Natural Science Foundation of China under grants 61422112, 61371146, 61521062, and 61527804.
References (50)
- et al., Learning a blind quality evaluation engine of screen content images, Neurocomputing, 2016.
- et al., Stereoscopic image quality assessment method based on binocular combination saliency model, Signal Process., 2016.
- et al., Full-reference image quality assessment by combining global and local distortion measures, Signal Process., 2016.
- et al., Biologically inspired image quality assessment, Signal Process., 2016.
- et al., Visual attention analysis and prediction on human faces, Inf. Sci., 2017.
- et al., Perceptual visual quality metrics: a survey, J. Vis. Commun. Image Represent., 2011.
- et al., Reduced-reference quality assessment of screen content images, IEEE Trans. Circuits Syst. Video Technol., 2016.
- et al., Overview of the emerging HEVC screen content coding extension, IEEE Trans. Circuits Syst. Video Technol., 2016.
- et al., Perceptual quality assessment of screen content images, IEEE Trans. Image Process., 2015.
- et al., Subjective and objective quality assessment of compressed screen content images, IEEE J. Emerging Sel. Top. Circuits Syst., 2016.
- Saliency-guided quality assessment of screen content images, IEEE Trans. Multimedia.
- No-reference quality assessment of screen content pictures, IEEE Trans. Image Process.
- FSIM: a feature similarity index for image quality assessment, IEEE Trans. Image Process.
- VSI: a visual saliency-induced index for perceptual image quality assessment, IEEE Trans. Image Process.
- A fast reliable image quality predictor by fusing micro- and macro-structures, IEEE Trans. Ind. Electron.
- Visual attention data for image quality assessment databases, Proceedings of the IEEE International Symposium on Circuits and Systems.
- A psychovisual quality metric in free-energy principle, IEEE Trans. Image Process.
- Cross-dimensional perceptual quality assessment for low bit-rate videos, IEEE Trans. Multimedia.
- Model-based referenceless quality metric of 3D synthesized images using local image description, IEEE Trans. Image Process.
- Blind quality assessment of compressed images via pseudo structural similarity, Proceedings of the IEEE International Conference on Multimedia and Expo.
- Blind image quality assessment based on high order statistics aggregation, IEEE Trans. Image Process.
- Unified blind quality assessment of compressed natural, graphic, and screen content images, IEEE Trans. Image Process.
- State-of-the-art in visual attention modeling, IEEE Trans. Pattern Anal. Mach. Intell.
- Influence of compression artifacts on visual attention, Proceedings of the IEEE International Conference on Multimedia and Expo.
- The application of visual saliency models in objective image quality assessment: a statistical evaluation, IEEE Trans. Neural Netw. Learn. Syst.