Blind image quality assessment with channel attention based deep residual network and extended LargeVis dimensionality reduction
Introduction
Image Quality Assessment (IQA) is one of the fundamental problems in image processing. In recent years, many achievements have been made [1], [2], [3], among which No-Reference (NR) or Blind IQA (BIQA) has become a hotspot, since reference information is often unavailable (or may not exist) in many practical applications. IQA has developed from specific to non-specific distortion, and from single to hybrid distortions. IQA for specific distortion is designed for certain distortion types such as color shift, blur, noise, and JPEG compression [4], [5], and therefore requires knowing the distortion type of the image in advance. In contrast, IQA for non-specific distortion does not need to distinguish the type or cause of the distortion [6], [7], [8], [9], which gives it wider applicability than IQA for specific distortion.
Since 2012, deep learning has made tremendous breakthroughs in image classification [10], natural language processing [11], speech recognition [12], and other fields. The most representative deep learning algorithm is the Convolutional Neural Network (CNN), which learns implicit relationships directly from large-scale data and obtains a hierarchical feature representation ranging from low-level visual features to high-level semantic features, simulating the multi-layered cognitive mechanism of the human brain. Compared with traditional handcrafted features, deep features have a prominent advantage in extracting multi-level features and contextual information from the image, and have stronger representational and discriminative capability [13].
Recently, researchers have applied deep learning to BIQA and proposed many models [1], [2]; deep learning based BIQA has become the mainstream. Existing models can be divided into two categories. The first uses end-to-end training to learn a mapping between the image and the quality score directly [14], [15]. The second adopts a framework of “feature extraction + regression” [16], [17]: the deep features of the image are extracted first, and a regression method then establishes a mapping model between the deep features and the quality score. The key to both categories is using large-scale pair-wise data to train deep CNNs to extract the deep features of the image.
In this paper, a BIQA method is proposed under the framework of “feature extraction + regression”. First, a channel attention based ResNet50 [10] is used as the backbone network to extract the deep features of the image [18]. By assigning different weights to the feature channels, the representative capability of the deep features is improved significantly. To reduce the dimensionality of the deep features, the LargeVis [19] method, originally designed for the visualization of big data, is extended so that it can operate on a single feature vector. Next, the dimensionality-reduced features and the quality scores of the images form pair-wise samples, which are used to train an SVR that maps the low-dimensional feature representation to the quality score. Extensive experiments demonstrate that the proposed method achieves superior performance on both synthetic and authentic distortion datasets compared with state-of-the-art methods.
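The pipeline just described (deep features → dimensionality reduction → SVR regression) can be sketched as follows. The feature matrix, its dimensions, and the use of PCA as a stand-in for the extended LargeVis step are all illustrative assumptions: LargeVis itself builds a kNN graph and optimizes an embedding, which is not reproduced here.

```python
import numpy as np
from sklearn.decomposition import PCA  # stand-in for the extended LargeVis step
from sklearn.svm import SVR

# Hypothetical data: 2048-D backbone features (e.g. a ResNet50 FC layer) for N images,
# paired with subjective quality scores (MOS). All values are random placeholders.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 2048))    # deep features per image
scores = rng.uniform(0, 100, size=200)  # quality score per image

# Step 1: reduce feature dimensionality (PCA here; the paper uses extended LargeVis).
reducer = PCA(n_components=64).fit(feats)
low_dim = reducer.transform(feats)

# Step 2: train SVR on (low-dimensional feature, quality score) pairs.
model = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(low_dim, scores)

# Predict the quality of an image from its reduced feature vector.
pred = model.predict(reducer.transform(feats[:1]))
```

In practice the reducer and SVR would be fit on a training split only, and hyperparameters (number of components, C, epsilon) selected by cross-validation.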
Section snippets
Related work
BIQA methods can be roughly divided into traditional methods and deep learning-based methods. Traditional methods generally adopt a typical framework of “feature extraction + modeling”: the features of the image are extracted first, and a mathematical method then establishes a mapping model between the features and the quality score of the image [20]. One of the most representative works is Distortion Identification-based Image Verity and INtegrity Evaluation (DIIVINE) [21], proposed by
Proposed BIQA method
Based on the framework of “feature extraction + regression”, a BIQA method is proposed in this paper, as shown in Fig. 1. First, a channel attention mechanism is integrated into ResNet50 [10] to enhance the representative capability of the deep features by assigning different weights to the feature channels. The constructed network, denoted CA-ResNet50, is used as the backbone network to extract the features of the image. The fully connected layer features of CA-ResNet50 are extracted,
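The channel attention mechanism used here follows the squeeze-and-excitation idea cited in the references. A minimal numpy sketch on a single (C, H, W) feature map is shown below; the weight shapes and reduction ratio are illustrative assumptions, not the paper's actual CA-ResNet50 layers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_channel_attention(fmap, w1, w2):
    """Squeeze-and-excitation style channel reweighting of a (C, H, W) feature map.

    w1: (C//r, C) squeeze weights, w2: (C, C//r) excitation weights, for a
    reduction ratio r (shapes are illustrative, not taken from the paper).
    """
    c = fmap.shape[0]
    squeeze = fmap.mean(axis=(1, 2))                       # global average pool -> (C,)
    excite = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))   # FC -> ReLU -> FC -> sigmoid
    return fmap * excite.reshape(c, 1, 1)                  # reweight each channel

# Toy example with random weights: 8 channels, reduction ratio 2.
rng = np.random.default_rng(0)
c, r = 8, 2
fmap = rng.normal(size=(c, 4, 4))
w1 = rng.normal(size=(c // r, c))
w2 = rng.normal(size=(c, c // r))
out = se_channel_attention(fmap, w1, w2)
```

Because the excitation gate lies in (0, 1), each channel is attenuated by a learned importance weight rather than shifted, which is what lets attention emphasize quality-relevant channels.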
Experimental results and analysis
To verify the effectiveness of the proposed BIQA method, extensive experiments are conducted on authentic distortion datasets and synthetic distortion datasets, and the method is compared with state-of-the-art BIQA methods.
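BIQA methods are conventionally compared using the Spearman rank-order (SROCC) and Pearson linear (PLCC) correlation coefficients between predicted and subjective scores. Assuming the experiments follow this standard protocol, both metrics can be sketched in a few lines of numpy (tie handling in the ranks is omitted for brevity):

```python
import numpy as np

def plcc(pred, mos):
    """Pearson linear correlation between predicted and subjective scores."""
    p, m = pred - pred.mean(), mos - mos.mean()
    return float((p @ m) / (np.linalg.norm(p) * np.linalg.norm(m)))

def srocc(pred, mos):
    """Spearman rank-order correlation: Pearson correlation of the ranks.

    Note: this simple double-argsort ranking does not average tied ranks.
    """
    rank = lambda x: np.argsort(np.argsort(x)).astype(float)
    return plcc(rank(pred), rank(mos))
```

SROCC measures monotonic (ranking) agreement, while PLCC measures linear agreement; both approach 1 for a well-performing model.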
Conclusions
Based on a framework of “feature extraction + regression”, we proposed a BIQA method combining a channel attention mechanism with extended LargeVis dimensionality reduction. The channel attention mechanism is embedded into ResNet50 to extract deep features with powerful representative capability. Then, the extended LargeVis is used to reduce the dimensionality of the deep features. The quality assessment model is finally established by training an SVR with the dimensionality-reduced
Funding
This work is supported by the National Natural Science Foundation of China (Grant Nos. 61871006 and 61971016) and the Beijing Municipal Education Commission Cooperation Beijing Natural Science Foundation (Nos. KZ201810005002 and KZ201910005007).
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (52)
- No-reference image quality assessment in curvelet domain, Signal Process. Image Commun. (2014)
- Squeeze-and-excitation networks, IEEE Trans. Pattern Anal. Mach. Intell. (2019)
- L. Kang, P. Ye, Y. Li, D. Doermann, Convolutional neural networks for no-reference image quality assessment, ...
- Simultaneous estimation of image quality and distortion via multi-task convolutional neural networks
- Image quality assessment using similar scene as reference, European Conference on Computer Vision (2016)
- No-reference image blur assessment based on gradient profile sharpness
- No-reference quality assessment of JPEG images via a quality relevance map, IEEE Signal Process. Lett. (2014)
- No-reference image quality assessment in the spatial domain, IEEE Trans. Image Process. (2012)
- A two-step framework for constructing blind image quality indices, IEEE Signal Process. Lett. (2010)
- Blind image quality assessment: a natural scene statistics approach in the DCT domain, IEEE Trans. Image Process. (2012)
- Fast and accurate recurrent neural network acoustic models for speech recognition, Comput. Sci.
- Very deep convolutional networks for large-scale image recognition
- End-to-end blind image quality assessment using deep neural networks, IEEE Trans. Image Process.
- Deep neural networks for no-reference and full-reference image quality assessment, IEEE Trans. Image Process.
- On the use of deep learning for blind image quality assessment, Signal Image Video Process.
- DeepRN: a content preserving deep architecture for blind image quality assessment
- Visualizing large-scale and high-dimensional data
- Automatic prediction of perceptual image and video quality, Proc. IEEE
- Blind image quality assessment: from natural scene statistics to perceptual quality, IEEE Trans. Image Process.
- Universal blind image quality assessment metrics via natural scene statistics and multiple kernel learning, IEEE Trans. Neural Networks Learn. Syst.
- Blind image quality assessment based on multichannel feature fusion and label transfer, IEEE Trans. Circuits Syst. Video Technol.
- NMF-based image quality assessment using extreme learning machine, IEEE Trans. Cybern.
- Perceptual quality prediction on authentically distorted images using a bag of features approach, J. Vision
- Blind image quality assessment using a deep bilinear convolutional neural network, IEEE Trans. Circuits Syst. Video Technol.
This paper has been recommended for acceptance by Zicheng Liu.