Lightweight transformer and multi-head prediction network for no-reference image quality assessment

  • Original Article
  • Published in Neural Computing and Applications

Abstract

No-reference (NR) image quality assessment (IQA) is an important task in computer vision. Most NR-IQA methods based on deep neural networks fall short of the desired IQA performance and rely on bulky models that are difficult to deploy in practical scenarios. This paper proposes a lightweight transformer and multi-head prediction network for NR-IQA, consisting of two lightweight modules: feature extraction and multi-head prediction. The feature extraction module exploits lightweight transformer blocks to learn features at different scales for measuring different image distortions. The multi-head prediction module uses three weighted prediction blocks and a fully connected (FC) layer to aggregate the learned features into a predicted image quality score. Each weighted prediction block measures the importance of the individual elements of its input feature at a single scale. Because both the importance of feature elements within a scale and the importance of features across scales are considered, the multi-head prediction module produces more accurate predictions. Extensive experiments on standard IQA datasets show that the proposed method outperforms several baseline NR-IQA methods on the large image datasets, and that its model complexity is also lower than that of several recent NR-IQA methods.
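The abstract describes the aggregation scheme only at a high level. The following NumPy sketch illustrates the general idea of element-wise importance weighting per scale followed by an FC layer over the concatenated features. The shapes, the softmax weighting, and all function names are assumptions for illustration, not the paper's actual block design.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def weighted_prediction(feat, importance):
    # Scale each element of a same-scale feature by a learned importance weight
    # (softmax weighting is an assumption; the paper's block may differ).
    return feat * softmax(importance)

def multi_head_predict(feats, importances, fc_w, fc_b):
    # Weigh each scale's feature, concatenate across scales,
    # and map to a single quality score with an FC layer.
    pooled = np.concatenate(
        [weighted_prediction(f, w) for f, w in zip(feats, importances)]
    )
    return float(fc_w @ pooled + fc_b)

# Toy example: three feature scales with hypothetical dimensions.
rng = np.random.default_rng(0)
dims = (64, 32, 16)
feats = [rng.standard_normal(d) for d in dims]
importances = [rng.standard_normal(d) for d in dims]   # stands in for learned weights
fc_w = rng.standard_normal(sum(dims))

score = multi_head_predict(feats, importances, fc_w, 0.0)
print(f"predicted quality score: {score:.4f}")
```

The point of the sketch is that per-scale element weighting and cross-scale aggregation are handled by separate mechanisms, so the final FC layer sees features whose within-scale importance has already been accounted for.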


Data availability

The image datasets used to support the findings of this study can be downloaded from the public websites whose hyperlinks are provided in the cited articles.

References

  1. Lu F, Zhao Q, Yang G (2015) A no-reference image quality assessment approach based on steerable pyramid decomposition using natural scene statistics. Neural Comput Appl 26(1):77–90

  2. Chan KY, Lam H-K, Jiang H (2022) A genetic programming-based convolutional neural network for image quality evaluations. Neural Comput Appl 34(18):15409–15427

  3. Liang X, Tang Z, Huang Z, Zhang X, Zhang S (2023) Efficient hashing method using 2D–2D PCA for image copy detection. IEEE Trans Knowl Data Eng 35(4):3765–3778

  4. Muthusamy D, Sathyamoorthy S (2022) Deep belief network for solving the image quality assessment in full reference and no reference model. Neural Comput Appl 34(24):21809–21833

  5. Tang Z, Zhang X, Li X, Zhang S (2016) Robust image hashing with ring partition and invariant vector distance. IEEE Trans Inf Foren Secur 11(1):200–214

  6. Zheng H, Yang H, Fu J, Zha Z-J, Luo J (2021) Learning conditional knowledge distillation for degraded-reference image quality assessment. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2021), pp 10222–10231

  7. Tang Z, Huang Z, Yao H, Zhang X, Chen L, Yu C (2018) Perceptual image hashing with weighted DWT features for reduced-reference image quality assessment. Comput J 61:1695–1709

  8. Chen Z, Che Y, Liang X, Tang Z (2022) Multi-level feature aggregation network for full-reference image quality assessment. In: Proceedings of the 34th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2022), pp 861–866

  9. Chen Y, Chen Z, Yu M, Tang Z (2023) Dual-feature aggregation network for no-reference image quality assessment. In: Proceedings of the 29th International Conference on MultiMedia Modeling (MMM 2023), pp 149–161

  10. Saad MA, Bovik AC, Charrier C (2012) Blind image quality assessment: a natural scene statistics approach in the DCT domain. IEEE Trans Image Process 21(8):3339–3352

  11. Mittal A, Moorthy AK, Bovik AC (2012) No-reference image quality assessment in the spatial domain. IEEE Trans Image Process 21(12):4695–4708

  12. Zhang L, Zhang L, Bovik AC (2015) A feature-enriched completely blind image quality evaluator. IEEE Trans Image Process 24(8):2579–2591

  13. Kang L, Ye P, Li Y, Doermann D (2014) Convolutional neural networks for no-reference image quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2014), pp 1733–1740

  14. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16x16 words: transformers for image recognition at scale. In: Proceedings of the International Conference on Learning Representations (ICLR 2021), pp 1–22

  15. Ferzli R, Karam LJ (2009) A no-reference objective image sharpness metric based on the notion of just noticeable blur (JNB). IEEE Trans Image Process 18(4):717–728

  16. Sheikh HR, Bovik AC, Cormack L (2005) No-reference quality assessment using natural scene statistics: JPEG 2000. IEEE Trans Image Process 14(11):1918–1927

  17. Suthaharan S (2009) No-reference visually significant blocking artifact metric for natural scene images. Signal Process 89:1647–1652

  18. Kim J, Lee S (2017) Fully deep blind image quality predictor. IEEE J Sel Topics Signal Process 11(1):206–220

  19. Ma K, Liu W, Zhang K, Duanmu Z, Wang Z, Zuo W (2018) End-to-end blind image quality assessment using deep neural networks. IEEE Trans Image Process 27(3):1202–1213

  20. Mittal A, Soundararajan R, Bovik AC (2013) Making a completely blind image quality analyzer. IEEE Signal Process Lett 20(3):209–212

  21. Xue W, Mou X, Zhang L, Bovik AC, Feng X (2014) Blind image quality assessment using joint statistics of gradient magnitude and Laplacian features. IEEE Trans Image Process 23(11):4850–4862

  22. Li Q, Lin W, Xu J, Fang Y (2016) Blind image quality assessment using statistical structural and luminance features. IEEE Trans Multim 18(12):2457–2469

  23. Feng P, Tang Z (2022) A survey of visual neural networks: current trends, challenges and opportunities. Multim Syst 29(2):693–724

  24. Huang H, Zeng H, Tian Y, Chen J, Zhu J, Ma K-K (2020) Light field image quality assessment: an overview. In: Proceedings of the IEEE Conference on Multimedia Information Processing and Retrieval (MIPR 2020), pp 348–353

  25. Niu Y, Zhong Y, Guo W, Shi Y, Chen P (2019) 2D and 3D image quality assessment: a survey of metrics and challenges. IEEE Access 7:782–801

  26. Yang X, Li F, Liu H (2021) TTL-IQA: transitive transfer learning based no-reference image quality assessment. IEEE Trans Multim 23:4326–4340

  27. Golestaneh SA, Dadsetan S, Kitani KM (2022) No-reference image quality assessment via transformers, relative ranking, and self-consistency. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2022), pp 3989–3999

  28. Li F, Zhang Y, Cosman PC (2021) MMMNet: an end-to-end multi-task deep convolution neural network with multi-scale and multi-hierarchy fusion for blind image quality assessment. IEEE Trans Circuits Syst Video Technol 31(12):4798–4811

  29. Ma J, Wu J, Li L, Dong W, Xie X, Shi G, Lin W (2021) Blind image quality assessment with active inference. IEEE Trans Image Process 30:3650–3663

  30. Pan Z, Yuan F, Lei J, Fang Y, Shao X, Kwong S (2022) VCRNet: visual compensation restoration network for no-reference image quality assessment. IEEE Trans Image Process 31:1613–1627

  31. Zhou Z, Xu Y, Quan Y, Xu R (2022) Deep blind image quality assessment using dual-order statistics. In: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME 2022), pp 1–6

  32. Pan Z, Yuan F, Wang X, Xu L, Shao X, Kwong S (2023) No-reference image quality assessment via multibranch convolutional neural networks. IEEE Trans Artif Intell 4(1):148–160

  33. Tan M, Le QV (2019) EfficientNet: rethinking model scaling for convolutional neural networks. In: Proceedings of the 36th International Conference on Machine Learning (ICML 2019), pp 10691–10700

  34. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), pp 1–14

  35. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L-C (2018) MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), pp 4510–4520

  36. Howard A, Sandler M, Chen B, Wang W, Chen L-C, Tan M, Chu G, Vasudevan V, Zhu Y, Pang R, Adam H, Le Q (2019) Searching for MobileNetV3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2019), pp 1314–1324

  37. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. CoRR abs/1704.04861

  38. Shi W, Caballero J, Huszár F, Totz J, Aitken AP, Bishop R, Rueckert D, Wang Z (2016) Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2016), pp 1874–1883

  39. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), pp 234–241

  40. Yang S, Wu T, Shi S, Lao S, Gong Y, Cao M, Wang J, Yang Y (2022) MANIQA: multi-dimension attention network for no-reference image quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2022), pp 1190–1199

  41. Sheikh HR, Sabir MF, Bovik AC (2006) A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans Image Process 15(11):3440–3451

  42. Larson EC, Chandler DM (2010) Most apparent distortion: full-reference image quality assessment and the role of strategy. J Electron Imag 19(1):011006

  43. Ponomarenko N, Ieremeiev O, Lukin V, Egiazarian K, Jin L, Astola J, Vozel B, Chehdi K, Carli M, Battisti F, Kuo C-CJ (2013) Color image database TID2013: peculiarities and preliminary results. In: Proceedings of the European Workshop on Visual Information Processing (EUVIP 2013), pp 106–111

  44. Lin H, Hosu V, Saupe D (2019) KADID-10k: a large-scale artificially distorted IQA database. In: Proceedings of the 11th International Conference on Quality of Multimedia Experience (QoMEX 2019), pp 1–3

  45. Hosu V, Lin H, Sziranyi T, Saupe D (2020) KonIQ-10k: an ecologically valid database for deep learning of blind image quality assessment. IEEE Trans Image Process 29:4041–4056

  46. Su S, Yan Q, Zhu Y, Zhang C, Ge X, Sun J, Zhang Y (2020) Blindly assess image quality in the wild guided by a self-adaptive hyper network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp 3664–3673

  47. Yu M, Tang Z, Zhang X, Zhong B, Zhang X (2022) Perceptual hashing with complementary color wavelet transform and compressed sensing for reduced-reference image quality assessment. IEEE Trans Circuits Syst Video Technol 32(11):7559–7574

  48. Ye P, Kumar J, Kang L, Doermann D (2012) Unsupervised feature learning framework for no-reference image quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2012), pp 1098–1105

  49. Xu J, Ye P, Li Q, Du H, Liu Y, Doermann D (2016) Blind image quality assessment based on high order statistics aggregation. IEEE Trans Image Process 25(9):4444–4457

  50. Bosse S, Maniry D, Müller K-R, Wiegand T, Samek W (2018) Deep neural networks for no-reference and full-reference image quality assessment. IEEE Trans Image Process 27(1):206–219

  51. Zhang W, Ma K, Yan J, Deng D, Wang Z (2020) Blind image quality assessment using a deep bilinear convolutional neural network. IEEE Trans Circuits Syst Video Technol 30(1):36–47

  52. Zhu H, Li L, Wu J, Dong W, Shi G (2020) MetaIQA: deep meta-learning for no-reference image quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp 14131–14140

  53. Ying Z, Niu H, Gupta P, Mahajan D, Ghadiyaram D, Bovik A (2020) From patches to pictures (PaQ-2-PiQ): mapping the perceptual space of picture quality. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp 3572–3582

  54. Otroshi-Shahreza H, Amini A, Behroozi H (2018) No-reference image quality assessment using transfer learning. In: Proceedings of the 9th International Symposium on Telecommunications (IST 2018), pp 637–640

  55. Dendi SVR, Dev C, Kothari N, Channappayya SS (2019) Generating image distortion maps using convolutional autoencoders with application to no reference image quality assessment. IEEE Signal Process Lett 26(1):89–93

  56. Zhou Z, Li J, Xu Y, Quan Y (2020) Full-reference image quality metric for blurry images and compressed images using hybrid dictionary learning. Neural Comput Appl 32:12403–12415

  57. Tang Z, Chen Z, Li Z, Zhong B, Zhang X, Zhang X (2023) Unifying dual-attention and siamese transformer network for full-reference image quality assessment. ACM Trans Multim Comput Commun Appl 19(6):1–24

Acknowledgements

The authors would like to thank the anonymous reviewers for their helpful comments and suggestions.

Funding

This work was partially supported by the Guangxi Natural Science Foundation (2022GXNSFAA035506), the Project of Guangxi Science and Technology (GuiKeAB23026040), the National Natural Science Foundation of China (62272111, 61962008, 62302108, 62062013), the Guangxi “Bagui Scholar” Team for Innovation and Research, the Guangxi Talent Highland Project of Big Data Intelligence and Application, the Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, and the Innovation Project of Guangxi Graduate Education (YCSW2023131).

Author information

Corresponding author

Correspondence to Zhenjun Tang.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tang, Z., Chen, Y., Chen, Z. et al. Lightweight transformer and multi-head prediction network for no-reference image quality assessment. Neural Comput & Applic 36, 1931–1946 (2024). https://doi.org/10.1007/s00521-023-09188-3
