Perceptual quality assessment for no-reference image via optimization-based meta-learning
Introduction
In recent years, with the spread of smart mobile devices and advances in multimedia technology, digital images have become increasingly widespread in daily life. At every stage of the media processing chain, digital images are inevitably degraded and distorted. Having human observers assess visual quality directly, however, requires lengthy and expensive subjective experiments. Objective image quality assessment (IQA) [1] addresses this problem by predicting image quality with a model that simulates the human visual perception system. Owing to their convenience, speed, stability, and reliability, objective IQA methods have broad application prospects in many fields, such as image fusion and image restoration [2], [3].
In practical applications, no-reference IQA (NR-IQA) [4], also called blind IQA (BIQA), is more attractive than full-reference (FR) and reduced-reference (RR) IQA because reference information is unavailable in many scenarios. For the same reason, it is also more challenging. According to the type of distortion handled, NR-IQA methods can be categorized into distortion-specific methods and general-purpose methods. Distortion-specific methods assume that images contain a particular kind of noise or artifact, such as JPEG compression [5], blur/noise [6], or JPEG2000 compression [7]. Since images usually contain various unknown distortion types, an increasing number of general-purpose models have been proposed. At the same time, following the breakthroughs of machine learning in computer vision, researchers have begun to introduce it into IQA. Using features that are either hand-crafted or learned, these methods do not have to predict the distortion type; instead, they characterize the degree of deviation of the distorted images. Natural scene statistics (NSS) of images can be used to design hand-crafted features [8], [9]. With the rise of data-driven machine learning research, learning-based NR-IQA methods [10], [11], which learn quality-aware features from given images, have become very popular.
In recent years, learning-based image quality metrics (IQMs) have performed well because of the powerful fitting capability of deep neural networks (DNNs) [12], [13]. Although DNNs have proven adaptable to different learning tasks, training them requires a large amount of labeled data and computing resources. Since IQA is usually a small-sample problem, training deep IQMs from scratch is challenging. To resolve this issue, researchers have developed methods that reduce the need for massive datasets. As shown in Fig. 1(a), some studies employed a transfer learning-based approach that uses convolutional networks pre-trained on non-IQA data, such as ImageNet [14]. However, these methods are limited by the low correlation between image classification and IQA, which necessitates many fine-tuning steps. Moreover, recent studies have learned NR-IQA models from relative ranking information derived from distortion specifications [15], FR-IQA models [16], and human data [17]. Because the degradation process in these methods is explicitly defined, however, they perform well only on synthetic distortions and cannot be extended to unknown distortions.
Because many deep learning metrics learn each task independently, it is difficult to find neural network weights that generalize well from small datasets. In contrast, humans can generalize to images with unknown distortions using high-quality prior knowledge, without requiring large amounts of data. In addition, most IQA models spend a lot of time in the initial training stage [18], whereas some practical applications, such as monitoring, require real-time processing. Therefore, we consider improving the fast-learning ability of the network. In particular, the IQA of each known distortion can be regarded as a task, and a few-shot learning (FSL) strategy [19] can be used to adapt rapidly to new tasks. As a promising approach in FSL, meta-learning has been successfully used in various fields of computer vision [20]. For unseen distortion tasks with few instances, meta-learning can acquire prior knowledge from NR-IQA tasks of various given distortion types and then guide the model to update its parameters accurately and quickly, as shown in Fig. 1(b).
In this paper, we address the insufficient generalization ability of deep IQA models and propose an improved meta-learning framework for NR-IQA, as shown in Fig. 2. To this end, following an optimization-based meta-learning strategy, the meta-model and optimizer parameters are trained using the test loss obtained from a large number of synthetically distorted NR-IQA tasks. In this way, gradient descent brings the model to a position well suited for subsequent updates. During training, the meta-learner guides the network to optimize its weights in each task so that it adapts to the current task quickly. The meta-learner is then updated with the experience learned from a series of tasks to find a set of potential initialization parameters for new tasks. The contributions of this paper are summarized as follows:
(1) We integrate the deep IQA model with an optimization-based meta-learning strategy into a meta-model. By leveraging prior knowledge from various tasks, the model can handle unseen distortions without hand-designed training techniques, complex network structure design, or hyperparameter adjustment. In addition, the trained meta-model learns a general initial representation and gradient direction for the IQA model, enabling rapid parameter updates from a small number of training examples.
(2) We introduce an improved meta-learning metric that makes the network learn to learn: it optimizes the initialization parameters and the learning rate of the model simultaneously in one meta-learning step, instead of setting them manually. This accelerates the convergence of the model.
(3) Experiments demonstrate that our meta-learning framework can learn knowledge of various distortions and deal with complex real distortions. This indicates that our approach can be extended to other NR-IQA tasks and significantly improves generalization performance.
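To make the idea of jointly learning an initialization and a learning rate concrete, the following is a minimal sketch in the style of Meta-SGD. It is not the implementation evaluated in this paper: the linear model stands in for the deep IQA network, the synthetic regression tasks produced by the hypothetical `sample_task` stand in for distortion-specific NR-IQA tasks, and the names `theta`, `alpha`, and `meta_lr` are illustrative assumptions.

```python
import numpy as np

DIM = 3  # number of model parameters in this toy example (assumption)

def mse(w, X, y):
    """Half mean squared error of the linear model X @ w."""
    return 0.5 * np.mean((X @ w - y) ** 2)

def mse_grad(w, X, y):
    """Gradient of mse with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def sample_task(rng, n=20):
    """Hypothetical stand-in for one distortion task: (support set, query set)."""
    w_true = np.array([1.0, -2.0, 0.5]) + 0.3 * rng.standard_normal(DIM)
    Xs = rng.standard_normal((n, DIM))
    Xq = rng.standard_normal((n, DIM))
    return (Xs, Xs @ w_true), (Xq, Xq @ w_true)

def adapt(theta, alpha, X, y):
    """Inner step: one gradient update with per-parameter learning rates."""
    return theta - alpha * mse_grad(theta, X, y)

def meta_train(steps=2000, meta_lr=0.01, seed=0):
    """Outer loop: update initialization theta and learning rates alpha together."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(DIM)         # meta-learned initialization
    alpha = np.full(DIM, 0.01)    # meta-learned learning rates
    for _ in range(steps):
        (Xs, ys), (Xq, yq) = sample_task(rng)
        g_s = mse_grad(theta, Xs, ys)       # support-set gradient
        theta_task = theta - alpha * g_s    # task-adapted parameters
        g_q = mse_grad(theta_task, Xq, yq)  # query ("test") gradient
        # Exact meta-gradients for a linear model:
        # d(theta_task)/d(theta) = I - diag(alpha) @ H_s, with H_s the support Hessian.
        H_s = Xs.T @ Xs / len(ys)
        jac = np.eye(DIM) - alpha[:, None] * H_s
        grad_theta = jac.T @ g_q
        grad_alpha = -g_s * g_q
        theta = theta - meta_lr * grad_theta
        alpha = alpha - meta_lr * grad_alpha
    return theta, alpha

if __name__ == "__main__":
    theta, alpha = meta_train()
    print("initialization:", theta)
    print("learning rates:", alpha)
```

Both `theta` and `alpha` are updated from the query loss of the adapted parameters, so a single meta-step tunes where adaptation starts and how far each parameter moves. For a deep network, the analytic Jacobian used here would be replaced by automatic differentiation.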
The rest of this paper is structured as follows. Section 2 briefly reviews related work on NR-IQA and meta-learning algorithms. Section 3 describes the optimization-based meta-learning framework for the NR-IQA problem in detail. Section 4 presents our experimental results. Finally, Section 5 concludes the paper.
Section snippets
No-reference image quality assessment
NR-IQA methods fall into two types: distortion-specific methods [5], [6], [7] and general-purpose methods [8], [9]. The former are not effective for images with multiple mixed distortions. For example, image blur is a key factor affecting image quality. Zhang et al. [21] proposed a no-reference metric combining the scale-invariant feature transform (SIFT) and the sum of squares of alternating current coefficients (SSAD) to evaluate the degree of blurring. The prediction results are highly consistent with
Our approach
Our work addresses the deep IQA problem using meta-learning. In this section, we describe the relevant techniques in detail and train our meta-model and optimizer on samples labeled with distortion types through an optimization-based metric. Ultimately, the experienced meta-learner learns how to choose the model initialization and optimization rules, which guide the model to adapt quickly from a few new examples.
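One common way to write down such learned initialization and optimization rules is the two-level update below. It is given only as a hedged illustration under assumed notation; the exact formulation used in this section is not reproduced in this excerpt, and all symbols are assumptions.

```latex
% Assumed notation (not taken from the paper):
% \theta: meta-learned initialization, \alpha: meta-learned per-parameter learning rates,
% \beta: meta learning rate, D_i^{s} / D_i^{q}: support / query split of task T_i.
\begin{align}
\theta_i' &= \theta - \alpha \odot \nabla_{\theta} \mathcal{L}_{D_i^{s}}(\theta)
  && \text{(inner step: adapt to task } T_i\text{)} \\
(\theta, \alpha) &\leftarrow (\theta, \alpha)
  - \beta \, \nabla_{(\theta, \alpha)} \sum_{i} \mathcal{L}_{D_i^{q}}(\theta_i')
  && \text{(outer step: update the meta-learner)}
\end{align}
```

Here the inner step uses only the few labeled examples of one distortion task, while the outer step uses the loss of the adapted parameters on held-out examples of the same task, so the initialization and the learning rates are both chosen for fast adaptation.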
Datasets
The performance of our method is assessed on both synthetic and authentic datasets. The synthetic datasets include TID2013 [43] and KADID-10k [44]. The former contains 25 original images, each processed with 24 distortion types at 5 levels. It provides a mean opinion score (MOS) for each test image, with values in the range [0, 9]. The latter contains 81 original images and 10,125 images processed with 25 distortion types at 5 distortion levels. This dataset provides
Discussion
Data-driven IQA is well ahead of traditional metrics, but its most typical challenge is the lack of sufficient training samples. In the real world especially, labels for distorted images are extremely scarce. Meta-learning can learn the essence of IQA by combining the experience of previous tasks with the ability to deal with unseen tasks quickly. Therefore, the advantages of computer storage are used to train specific distortion tasks to better learn the general knowledge of various
Conclusion
In this paper, we propose a meta-model based on a deep neural network and an optimizer with learnable parameters to solve the NR-IQA problem, which can successfully evaluate images with new and unseen distortions. An optimization-based meta-learning algorithm is applied to DNNs to evaluate the perceptual quality of images, and learnable parameters are introduced into the model optimizer so that it learns general knowledge and appropriate quick-adaptation strategies. In the experiment, we first use the
CRediT authorship contribution statement
Longsheng Wei: Data curation, Writing - original draft, Writing - review & editing. Qingqing Yan: Conceptualization, Methodology, Software. Wei Liu: Conceptualization, Methodology, Software. Dapeng Luo: Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the Joint Foundation of China Aerospace Science and Industry for Equipment Pre Research 2020, the Opening Fund of Key Laboratory of Geological Survey and Evaluation of Ministry of Education (Grant No. GLAB2020 ZR06), the Fundamental Research Funds for the Central Universities, and the National Natural Science Foundation of China under contracts 61603357 and 61302137.
References (49)
- et al., DSAGAN: A generative adversarial network based on dual-stream attention mechanism for anatomical and functional image fusion, Inf. Sci. (2021)
- et al., No-reference image quality assessment for contrast-changed images via a semi-supervised robust PCA model, Inf. Sci. (2021)
- et al., Blind quality assessment for image superresolution using deep two-stream convolutional networks, Inf. Sci. (2020)
- et al., Perceptual image quality assessment: A survey, Sci. China Inf. Sci. (2020)
- et al., High-quality image restoration using low-rank patch regularization and global structure sparsity, IEEE Trans. Image Process. (2018)
- et al., No-reference JPEG image quality assessment based on blockiness and luminance change, IEEE Signal Process. Lett. (2017)
- et al., Effective and fast estimation for image sensor noise via constrained weighted least squares, IEEE Trans. Image Process. (2018)
- et al., Image quality assessment algorithms for JPEG and JPEG2000 images: A comparative study
- et al., No-reference image quality assessment in the spatial domain, IEEE Trans. Image Process. (2012)
- et al., Blind image quality assessment: A natural scene statistics approach in the DCT domain, IEEE Trans. Image Process. (2012)
- Unsupervised feature learning framework for no-reference image quality assessment
- Blind image quality assessment based on high order statistics aggregation, IEEE Trans. Image Process.
- End-to-end blind image quality assessment using deep neural networks, IEEE Trans. Image Process.
- ImageNet: A large-scale hierarchical image database
- RankIQA: Learning from rankings for no-reference image quality assessment
- Blind image quality assessment by learning from multiple annotators
- Learning to rank for blind image quality assessment
- Hallucinated-IQA: No-reference image quality assessment via adversarial learning
- Generalizing from a few examples: A survey on few-shot learning, ACM Comput. Surv.
- Meta-DRN: Meta-learning for 1-shot image segmentation
- No-reference image blur assessment based on SIFT and DCT, J. Inf. Hiding Multimedia Signal Process.
- Unsupervised blind image quality evaluation via statistical measurements of structure, naturalness, and perception, IEEE Trans. Circuits Syst. Video Technol.
- Deep convolutional neural models for picture-quality prediction: Challenges and solutions to data-driven image quality assessment, IEEE Signal Process. Mag.