Abstract
No-reference (NR) image quality assessment (IQA) is an important task in computer vision. Most NR-IQA methods based on deep neural networks fall short of desirable IQA performance and rely on bulky models, which makes them difficult to deploy in practical scenarios. This paper proposes a lightweight transformer and multi-head prediction network for NR-IQA. The proposed method consists of two lightweight modules: feature extraction and multi-head prediction. The feature extraction module exploits lightweight transformer blocks to learn features at different scales for measuring different image distortions. The multi-head prediction module uses three weighted prediction blocks and a fully connected (FC) layer to aggregate the learned features into a predicted image quality score. Each weighted prediction block measures the importance of the individual elements of its input feature at a single scale. Because both the importance of feature elements at the same scale and the importance of features across different scales are considered, the multi-head prediction module provides more accurate prediction results. Extensive experiments on standard IQA datasets show that the proposed method outperforms several baseline NR-IQA methods on large image datasets, and that its model complexity is also lower than that of several recent NR-IQA methods.
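The aggregation idea described above can be illustrated with a minimal sketch: element-wise importance weights are applied to the feature at each scale, and an FC layer then fuses the weighted features across scales into a single score. This is a hypothetical NumPy illustration of the general scheme, not the authors' implementation; the feature sizes, the softmax weighting, and all parameter values are assumptions for demonstration.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax for the importance weights
    e = np.exp(x - x.max())
    return e / e.sum()

def weighted_prediction_block(feat, w):
    # weigh the elements of one feature (single scale) by learned importance
    return feat * softmax(w)

def multi_head_prediction(feats, elem_weights, fc_w, fc_b=0.0):
    # apply per-scale element weighting, then fuse all scales with an FC layer
    weighted = [weighted_prediction_block(f, w) for f, w in zip(feats, elem_weights)]
    fused = np.concatenate(weighted)
    return float(fused @ fc_w + fc_b)

rng = np.random.default_rng(0)
feats = [rng.standard_normal(8) for _ in range(3)]    # features at three scales
elem_w = [rng.standard_normal(8) for _ in range(3)]   # stand-in learned weights
fc_w = rng.standard_normal(3 * 8)                     # stand-in FC parameters
score = multi_head_prediction(feats, elem_w, fc_w)
print(score)  # a single scalar quality prediction
```

In the actual network these weights would be learned end-to-end; the sketch only shows how per-element (same-scale) importance and cross-scale aggregation combine into one prediction head.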
Data availability
The image datasets used to support the findings of this study can be downloaded from the public websites whose hyperlinks are provided in the cited articles.
Acknowledgements
The authors would like to thank the anonymous reviewers for their helpful comments and suggestions.
Funding
This work was partially supported by the Guangxi Natural Science Foundation (2022GXNSFAA035506), the Project of Guangxi Science and Technology (GuiKeAB23026040), the National Natural Science Foundation of China (62272111, 61962008, 62302108, 62062013), the Guangxi “Bagui Scholar” Team for Innovation and Research, the Guangxi Talent Highland Project of Big Data Intelligence and Application, the Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, and the Innovation Project of Guangxi Graduate Education (YCSW2023131).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tang, Z., Chen, Y., Chen, Z. et al. Lightweight transformer and multi-head prediction network for no-reference image quality assessment. Neural Comput & Applic 36, 1931–1946 (2024). https://doi.org/10.1007/s00521-023-09188-3