DOI: 10.1145/3581783.3613775

A Model-Agnostic Semantic-Quality Compatible Framework based on Self-Supervised Semantic Decoupling

Published: 27 October 2023

Abstract

Blind Image Quality Assessment (BIQA) is a challenging research topic that is critical for preprocessing and optimizing downstream vision tasks such as semantic recognition and image restoration. However, there has been a significant disconnect between BIQA research and other vision tasks. The primary cause of this disconnect is the incompatibility of existing BIQA models with other vision tasks, which incurs significant computational overhead. To address this issue, we propose a model-agnostic semantic-quality compatible framework that can simultaneously generate quality and semantic predictions. By incorporating a lightweight learning architecture, we demonstrate that a parameter-fixed semantic-oriented backbone can predict the perceptual quality of images as accurately as models trained end-to-end. We systematically study the major components of our framework, and our experimental results demonstrate the superiority of our model in terms of both complexity and accuracy. The source code of this work is available at https://github.com/MaxiaoyuHehe/SQCFNet.
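As a minimal sketch of the idea described above (not the released SQCFNet code; see the repository linked in the abstract), the snippet below assumes a PyTorch setup in which an ImageNet-pretrained ResNet-50 stands in for the parameter-fixed semantic-oriented backbone, and a hypothetical lightweight head (the name SemanticQualityHead, its hidden size, and all variable names are illustrative assumptions) produces semantic logits and a scalar quality score from one shared forward pass.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class SemanticQualityHead(nn.Module):
    """Hypothetical lightweight head: reuses the pretrained classifier for the
    semantic branch and adds a small MLP that regresses a quality score."""

    def __init__(self, semantic_fc: nn.Module, feat_dim: int = 2048):
        super().__init__()
        self.semantic_fc = semantic_fc                  # frozen semantic classifier
        self.quality_fc = nn.Sequential(                # trainable quality regressor
            nn.Linear(feat_dim, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, 1),
        )

    def forward(self, feats: torch.Tensor):
        return self.semantic_fc(feats), self.quality_fc(feats).squeeze(-1)


# Semantic-oriented backbone; an ImageNet-pretrained ResNet-50 is used as a stand-in.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
semantic_fc = backbone.fc                 # keep the original 1000-class classifier
backbone.fc = nn.Identity()               # expose the 2048-d pooled features instead
for module in (backbone, semantic_fc):
    for p in module.parameters():
        p.requires_grad = False           # backbone and classifier stay parameter-fixed
backbone.eval()

head = SemanticQualityHead(semantic_fc)

# One shared forward pass yields both semantic logits and quality scores.
x = torch.randn(4, 3, 224, 224)           # dummy batch of preprocessed images
with torch.no_grad():
    feats = backbone(x)
class_logits, quality_scores = head(feats)   # shapes: (4, 1000) and (4,)
```

In this sketch only the small quality branch would be trained on quality annotations, so the backbone's semantic predictions are untouched and the extra cost over running the backbone alone is a few small fully connected layers; the self-supervised semantic-decoupling training named in the title is not reproduced here.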



      Published In

      MM '23: Proceedings of the 31st ACM International Conference on Multimedia
      October 2023
      9913 pages
      ISBN:9798400701085
      DOI:10.1145/3581783
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Author Tags

      1. blind image quality assessment
      2. semantic recognition
      3. semantic-quality compatible framework

      Qualifiers

      • Research-article


      Conference

MM '23: The 31st ACM International Conference on Multimedia
October 29 - November 3, 2023
Ottawa, ON, Canada

      Acceptance Rates

      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
