Perceptual quality assessment for no-reference image via optimization-based meta-learning

doi:10.1016/j.ins.2022.07.163

Information Sciences

Volume 611, September 2022, Pages 30-46

Highlights

• We integrate image quality assessment with a meta-learning strategy.
• The model optimizes the initialization parameters and the learning rate simultaneously.
• Our framework learns knowledge of various distortions and handles complex real distortions.

Abstract

Image quality assessment (IQA) is a critical problem in computer vision. It aims to simulate the human visual system (HVS) in order to evaluate how far a distorted image deviates from its reference image. Deep learning algorithms have been successfully applied to the quality evaluation of no-reference images in recent years. Unfortunately, they suffer from over-fitting and weak generalization caused by insufficient labeled data. Meta-learning offers a new approach, as it has been shown to address few-shot learning problems. However, commonly used meta-learning methods learn only the initialization of the weights, which cannot guarantee the optimal gradient direction, and their manually designed settings lower both accuracy and speed. In this work, an improved meta-learning framework is applied to no-reference (NR) IQA to meet these challenges. It can reach its maximum generalization performance within only a few update iterations. Specifically, we collect a large number of NR-IQA tasks with different distortions to pre-train the meta-model and the optimizer, so that they learn a general weight initialization and optimization rule. The meta-model thereby acquires meta-knowledge and learns a unique learning rate for each task. Finally, it can be adapted directly to new NR-IQA tasks by fine-tuning on only a few images. Experiments on synthetic and authentic datasets show that our approach has stronger learning ability and better generalization performance when evaluating real distorted images, effectively reducing the dependence on manual labeling.

Introduction

In recent years, with the popularization of intelligent mobile devices and the advancement of multimedia technology, digital images have become increasingly widespread in daily life. At any stage of the media technology chain, digital images will inevitably be degraded and distorted. However, having human observers assess the visual quality of digital images directly requires lengthy and expensive subjective experiments. Objective image quality assessment (IQA) [1] addresses this problem by predicting image quality with a model that simulates the human visual perception system. Owing to their convenience, speed, stability, and reliability, objective IQA methods have broad application prospects in many fields, such as image fusion and image restoration [2], [3].

In practical applications, NR-IQA [4], also called blind IQA (BIQA), is more attractive than full-reference (FR) and reduced-reference (RR) IQA because reference information is unavailable in many scenarios. However, the absence of a reference also makes NR-IQA more challenging. According to the type of distortion they target, NR-IQA methods can be categorized into distortion-specific methods and general-purpose methods. Distortion-specific methods assume that digital images contain specific noise or artifacts, such as JPEG compression [5], blur/noise [6] and JPEG2000 compression [7]. Since images usually contain various unknown distortion types, an increasing number of general-purpose models have been proposed. At the same time, with the breakthrough of machine learning in computer vision, researchers have begun to introduce it into IQA. Using features that are either handcrafted or learned, these methods do not need to predict the distortion type; instead, they characterize the degree to which a distorted image deviates. Natural scene statistics (NSS) of images can be used to design handcrafted features [8], [9]. With the rise of data-driven machine learning research, learning-based NR-IQA methods [10], [11], which learn quality-aware features from the given images, have become very popular.

In recent years, learning-based image quality metrics (IQMs) have performed well because of the powerful fitting capability of deep neural networks (DNNs) [12], [13]. Although DNNs have proven their adaptability to different learning tasks, training DNN models requires a large amount of labeled data and computing resources. Since IQA is usually a small-sample problem, training deep IQMs from scratch has become a challenging task. To resolve this issue, researchers have developed new methods that reduce the need for massive datasets. As shown in Fig. 1(a), some studies employ a transfer-learning approach that uses convolutional networks pre-trained on non-IQA data, such as ImageNet [14]. However, these methods are limited by the low correlation between image classification and IQA, which makes many fine-tuning steps necessary. Moreover, recent studies have also learned NR-IQA models from relative ranking information derived from distortion specifications [15], FR-IQA models [16] and human data [17]. However, these methods assume a clearly defined degradation process, so they perform well only on synthetic distortions and cannot be extended to unknown distortions.

It is difficult to find neural network weights that generalize well from small datasets, since many deep learning methods learn each task independently. In contrast, humans can generalize to images with unknown distortions using high-quality prior knowledge and do not require large amounts of data. In addition, most IQA models spend a long time in the initial training stage [18], whereas some practical applications, such as monitoring, require real-time processing. Therefore, we consider improving the fast-learning ability of the network. In particular, we can regard the IQA of each known distortion as a task and use a few-shot learning (FSL) strategy [19] to adapt rapidly to new tasks. As a promising approach in FSL, meta-learning has been successfully used in various fields of computer vision [20]. For unseen distortion tasks with few instances, meta-learning can obtain prior knowledge from NR-IQA tasks of various given distortion types and then guide the model to update its parameters accurately and quickly, as shown in Fig. 1(b).
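The idea of treating the IQA of each known distortion type as one task can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the record layout `(image_id, distortion_type, mos)`, the function name `build_tasks`, the support/query sizes, and the toy scores are all hypothetical.

```python
import random
from collections import defaultdict

def build_tasks(records, n_support, n_query, seed=0):
    """Group labeled images by distortion type and split each group
    into a support set (for adaptation) and a query set (for the test loss).

    `records` is a list of (image_id, distortion_type, mos) tuples;
    this layout is hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for image_id, distortion, mos in records:
        by_type[distortion].append((image_id, mos))

    tasks = {}
    for distortion, items in by_type.items():
        if len(items) < n_support + n_query:
            continue  # skip distortion types with too few labeled images
        shuffled = items[:]
        rng.shuffle(shuffled)
        tasks[distortion] = {
            "support": shuffled[:n_support],
            "query": shuffled[n_support:n_support + n_query],
        }
    return tasks

# Toy data: three distortion types, six images each, made-up scores.
records = [
    (f"{d}_{i}", d, float(i))
    for d in ("blur", "noise", "jpeg")
    for i in range(6)
]
tasks = build_tasks(records, n_support=4, n_query=2)
```

Each entry of `tasks` then plays the role of one few-shot NR-IQA task: the support images are used to adapt the model, and the disjoint query images measure how well the adapted model performs.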

In this paper, we address the insufficient generalization ability of deep IQA models and propose an improved meta-learning framework for NR-IQA, as shown in Fig. 2. To this end, using an optimization-based meta-learning strategy, the parameters of the meta-model and the optimizer are trained with the test loss obtained from a large number of synthetically distorted NR-IQA tasks. In this way, gradient descent brings the model to a position that is well suited to subsequent updates. During training, the meta-learner guides the network to optimize its weights in each task so that it adapts quickly to the current task. The meta-learner is then updated with the experience learned from a series of tasks to find a set of promising initialization parameters for new tasks. The contributions of this paper are summarized as follows:

(1) We integrate the deep IQA model with an optimization-based meta-learning strategy into a meta-model. The model can easily handle unseen distortions by leveraging prior knowledge from various tasks, without hand-designed training techniques, complex network architecture design, or hyperparameter tuning. In addition, the trained meta-model learns a general initial representation and gradient direction for the IQA model, enabling rapid parameter updates from a small number of training examples.
(2) We introduce an improved meta-learning method that makes the network learn to learn: it optimizes the initialization parameters and the learning rate of the model simultaneously in one meta-learning step instead of setting them manually. In this way, the convergence of the model is accelerated.
(3) Experiments demonstrate that our meta-learning framework can learn knowledge of various distortions and deal with complex real distortions. This indicates that our approach can be extended to other NR-IQA tasks and significantly improves generalization performance.
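To make contribution (2) concrete, the sketch below updates an initialization and a learning rate together from the query (test) loss. This excerpt does not give the paper's update equations, so the rule here is an assumption: a Meta-SGD-style step on a toy one-parameter regression problem, with hand-derived gradients. All names (`meta_step`, `make_task`, `beta`) and the toy data are hypothetical.

```python
def mse(w, xs, ys):
    """Mean squared error of the scalar model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_grad(w, xs, ys):
    """Gradient of the MSE above with respect to the scalar weight w."""
    return sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

def curvature(xs):
    """Second derivative of the MSE above with respect to w."""
    return sum(2.0 * x * x for x in xs) / len(xs)

def adapt(w, alpha, xs, ys):
    """One inner step on a task, using the learned step size alpha."""
    return w - alpha * mse_grad(w, xs, ys)

def meta_step(w, alpha, tasks, beta):
    """Update the shared initialization w and the step size alpha together,
    using the query loss measured after one inner step on each task."""
    grad_w = grad_alpha = 0.0
    for (sx, sy), (qx, qy) in tasks:
        g_support = mse_grad(w, sx, sy)
        w_adapted = w - alpha * g_support
        g_query = mse_grad(w_adapted, qx, qy)
        # Chain rule through the inner step w' = w - alpha * g_support.
        grad_w += g_query * (1.0 - alpha * curvature(sx))
        grad_alpha += g_query * (-g_support)
    n = len(tasks)
    return w - beta * grad_w / n, alpha - beta * grad_alpha / n

def make_task(slope):
    """Toy task: fit y = slope * x from two support and two query points."""
    support_x, query_x = [1.0, 2.0], [1.5, 2.5]
    return ((support_x, [slope * x for x in support_x]),
            (query_x, [slope * x for x in query_x]))

def adapted_query_loss(w, alpha, tasks):
    """Average query loss after one inner step on each task."""
    return sum(mse(adapt(w, alpha, sx, sy), qx, qy)
               for (sx, sy), (qx, qy) in tasks) / len(tasks)

tasks = [make_task(s) for s in (1.0, 2.0, 3.0)]
w, alpha = 0.0, 0.01  # hand-picked starting values
loss_before = adapted_query_loss(w, alpha, tasks)
for _ in range(200):
    w, alpha = meta_step(w, alpha, tasks, beta=0.001)
loss_after = adapted_query_loss(w, alpha, tasks)
```

In this toy setting every task has the same support-loss curvature (5.0), so the learned step size moves toward its reciprocal, 0.2, at which point a single inner step fits each task. A real IQA network would use one learnable rate per parameter and automatic differentiation instead of hand-derived gradients.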

The rest of this paper is structured as follows. Section 2 briefly reviews related work on NR-IQA and meta-learning algorithms. Section 3 describes our optimization-based meta-learning framework for the NR-IQA problem in detail. Section 4 presents our experimental results. Finally, Section 5 concludes the paper.

Section snippets

No-reference image quality assessment

The distortion-specific methods [5], [6], [7] and general-purpose methods [8], [9] are two types of NR-IQA, and the former is not effective for images with multiple mixed distortions. For example, image blur is the key factor affecting image quality. Zhang et al. [21] proposed a no-reference metric combining scale-invariant feature transform (SIFT) and sum of squares of alternating current coefficients (SSAD) to evaluate the degree of blurring. The prediction results are highly consistent with

Our approach

Our work mainly addresses the deep IQA problem using meta-learning. In this section, we describe the related techniques in detail and train our meta-model and optimizer on samples labeled with distortion types using an optimization-based method. Eventually, the meta-learner, equipped with the relevant experience, learns how to choose the model initialization and optimization rules, which guide the model to adapt quickly from a few new examples.

Datasets

The performance of our method is assessed on both synthetic and authentic datasets. The synthetic datasets are TID2013 [43] and KADID-10k [44]. TID2013 contains 25 original images, each processed with 24 distortion types at 5 levels, and provides a mean opinion score (MOS) in the range [0, 9] for each test image. KADID-10k contains 81 original images and 10,125 distorted images produced by 25 distortion types at 5 distortion levels. This dataset provides

Discussion

Data-driven IQA is largely ahead of traditional metrics, but the most typical challenge is the lack of sufficient training samples. Especially in the real world, the labels of distorted images are extremely scarce. Meta-learning can learn the essence of IQA by combining the experience of previous tasks with the ability to deal with unseen tasks quickly. Therefore, the advantages of computer storage are used to train specific distortion tasks to better learn the general knowledge of various

Conclusion

In this paper, we propose a meta-model based on a deep neural network and an optimizer with learnable parameters to solve the NR-IQA problem; together they can successfully evaluate images with new and unseen distortions. An optimization-based meta-learning algorithm is applied to DNNs to evaluate the perceptual quality of images, and learnable parameters are set for the model optimizer so that it learns general knowledge and appropriate quick-adaptation strategies. In the experiment, we first use the

CRediT authorship contribution statement

Longsheng Wei: Data curation, Writing - original draft, Writing - review & editing. Qingqing Yan: Conceptualization, Methodology, Software. Wei Liu: Conceptualization, Methodology, Software. Dapeng Luo: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the Joint Foundation of China Aerospace Science and Industry for Equipment Pre Research 2020, the Opening Fund of Key Laboratory of Geological Survey and Evaluation of Ministry of Education (Grant No. GLAB2020 ZR06) and the Fundamental Research Funds for the Central Universities, and the National Natural Science Foundation of China under contracts (61603357, 61302137).

References (49)

Jun Fu et al., DSAGAN: A generative adversarial network based on dual-stream attention mechanism for anatomical and functional image fusion, Inf. Sci. (2021)
Jingchao Cao et al., No-reference image quality assessment for contrast-changed images via a semi-supervised robust PCA model, Inf. Sci. (2021)
Wei Zhou et al., Blind quality assessment for image superresolution using deep two-stream convolutional networks, Inf. Sci. (2020)
Guangtao Zhai et al., Perceptual image quality assessment: a survey, Sci. China Inf. Sci. (2020)
Mingli Zhang et al., High-quality image restoration using low-rank patch regularization and global structure sparsity, IEEE Trans. Image Process. (2018)
Yibing Zhan et al., No-reference JPEG image quality assessment based on blockiness and luminance change, IEEE Signal Process. Lett. (2017)
Li Dong et al., Effective and fast estimation for image sensor noise via constrained weighted least squares, IEEE Trans. Image Process. (2018)
Md Amir Baig et al., Image quality assessment algorithms for JPEG and JPEG2000 images: A comparative study
Anish Mittal et al., No-reference image quality assessment in the spatial domain, IEEE Trans. Image Process. (2012)
Michele A. Saad et al., Blind image quality assessment: A natural scene statistics approach in the DCT domain, IEEE Trans. Image Process. (2012)
Peng Ye et al., Unsupervised feature learning framework for no-reference image quality assessment
Jingtao Xu et al., Blind image quality assessment based on high order statistics aggregation, IEEE Trans. Image Process. (2016)
Kede Ma et al., End-to-end blind image quality assessment using deep neural networks, IEEE Trans. Image Process. (2017)
Jia Deng et al., ImageNet: A large-scale hierarchical image database
Xialei Liu et al., RankIQA: Learning from rankings for no-reference image quality assessment
Kede Ma et al., Blind image quality assessment by learning from multiple annotators
Lihao Zheng et al., Learning to rank for blind image quality assessment
Kwan-Yee Lin et al., Hallucinated-IQA: No-reference image quality assessment via adversarial learning
Yaqing Wang et al., Generalizing from a few examples: A survey on few-shot learning, ACM Comput. Surv. (2020)
Atmadeep Banerjee, Meta-DRN: Meta-learning for 1-shot image segmentation
Shan-Qing Zhang et al., No-reference image blur assessment based on SIFT and DCT, J. Inf. Hiding Multimedia Signal Process. (2018)
Yutao Liu et al., Unsupervised blind image quality evaluation via statistical measurements of structure, naturalness, and perception, IEEE Trans. Circuits Syst. Video Technol. (2019)
Guangcheng Wang, Feng Zhu, Zhaolin Lu, Xiaoping Yuan, L.D. Li, No-reference quality assessment of super-resolution...
Jongyoo Kim et al., Deep convolutional neural models for picture-quality prediction: Challenges and solutions to data-driven image quality assessment, IEEE Signal Process. Mag. (2017)
