DOI: 10.1145/3664647.3681511

Align-IQA: Aligning Image Quality Assessment Models with Diverse Human Preferences via Customizable Guidance

Published: 28 October 2024

Abstract

Aligning Image Quality Assessment (IQA) models with diverse human preferences remains challenging, because preferences vary across types of visual content such as user-generated content and AI-Generated Content (AIGC). Although existing IQA methods have achieved significant success on specific visual content by leveraging knowledge from pre-trained models, the intricate factors that shape final ratings and the specially designed network architectures of these methods leave gaps in their ability to accurately capture human preferences for novel visual content. To address this issue, we propose Align-IQA, a novel framework that generates visual quality scores aligned with diverse human preferences for various types of visual content. Align-IQA contains two key designs. (1) A customizable quality-aware guidance injection module: by injecting specializable quality-aware prior knowledge into general-purpose pre-trained models, this module guides the acquisition of quality-aware features and allows the features to be adjusted to match diverse human preferences for different types of visual content. (2) A multi-scale feature aggregation module: by simulating the multi-scale mechanism of the human visual system, this module extracts a more comprehensive representation of quality-aware features from the perspective of human perception. Extensive experimental results demonstrate that Align-IQA achieves performance better than or comparable to State-Of-The-Art (SOTA) methods. Notably, Align-IQA outperforms the previous best results on AIGC datasets, achieving Pearson's Linear Correlation Coefficients (PLCCs) of 0.890 (+3.73%) on AGIQA-1K and 0.924 (+1.99%) on AGIQA-3K. Additionally, Align-IQA reduces training parameters by 72.26% and inference overhead by 78.12% while maintaining SOTA performance.
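The abstract sketches two architectural ideas: a trainable guidance-injection adapter that feeds quality-aware priors into a frozen general-purpose backbone, and a multi-scale aggregator over features from several backbone stages. The PyTorch sketch below is a minimal, hypothetical rendering of those two ideas for readers who want a concrete picture; all module names, tensor shapes, and the gated-additive fusion scheme are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch; names, shapes, and the fusion scheme are
    # illustrative assumptions, not the authors' implementation.
    import torch
    import torch.nn as nn

    class GuidanceInjection(nn.Module):
        """Injects quality-aware prior features into frozen backbone tokens.

        The backbone stays frozen; only this lightweight adapter (and the
        regression head below) would be trained.
        """
        def __init__(self, dim: int, guide_dim: int):
            super().__init__()
            self.proj = nn.Linear(guide_dim, dim)     # map prior into backbone space
            self.gate = nn.Parameter(torch.zeros(1))  # zero-init: starts as identity

        def forward(self, feats: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
            # feats: (B, N, dim) backbone tokens; prior: (B, N, guide_dim)
            return feats + self.gate * self.proj(prior)

    class MultiScaleAggregation(nn.Module):
        """Pools features from several backbone stages into one representation,
        loosely mimicking the multi-scale mechanism of the human visual system."""
        def __init__(self, dims, out_dim: int):
            super().__init__()
            self.heads = nn.ModuleList(nn.Linear(d, out_dim) for d in dims)

        def forward(self, stage_feats):
            # stage_feats[i]: (B, N_i, dims[i]); pool tokens per stage, project
            # to a shared width, then average across scales.
            pooled = [h(f.mean(dim=1)) for h, f in zip(self.heads, stage_feats)]
            return torch.stack(pooled).mean(dim=0)

    class QualityHead(nn.Module):
        """Regresses the aggregated representation to a scalar quality score."""
        def __init__(self, dim: int):
            super().__init__()
            self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, 1))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.mlp(x).squeeze(-1)            # (B,) predicted scores

    # Toy usage with random tensors standing in for backbone and prior features.
    inject = GuidanceInjection(dim=768, guide_dim=512)
    aggregate = MultiScaleAggregation(dims=[768, 768], out_dim=256)
    head = QualityHead(dim=256)

    tokens = torch.randn(4, 196, 768)  # e.g. frozen ViT tokens for a batch of 4
    prior = torch.randn(4, 196, 512)   # specializable quality-aware prior
    guided = inject(tokens, prior)
    scores = head(aggregate([guided, torch.randn(4, 196, 768)]))
    print(scores.shape)                # torch.Size([4])

Freezing the backbone and training only the adapter and head is one plausible way to realize the reported reduction in training parameters; the paper's actual injection and aggregation schemes may differ.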




    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024, 11719 pages
    ISBN: 9798400706868
    DOI: 10.1145/3664647

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. ai-generated content
    2. customizable guidance
    3. human preference
    4. image quality assessment

    Qualifiers

    • Research-article

    Funding Sources

    • National Key R&D Program of China
    • Research Foundation of Education Bureau of Hunan Province
    • Major Project of Xiangjiang Laboratory
    • Natural Science Foundation of Hunan Province
    • NSFC

    Conference

    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne, VIC, Australia

    Acceptance Rates

    MM '24 paper acceptance rate: 1,150 of 4,385 submissions (26%)
    Overall acceptance rate: 2,145 of 8,556 submissions (25%)

