DOI: 10.1145/3615522.3615549

Enhancing Visual Understanding by Removing Dithering with Global and Self-Conditioned Transformation

Published: 20 October 2023

Abstract

PNG-8 images are widely used on the web because of their small file size, but their limited color palette often introduces dithering artifacts. Restoring such images with a conventional convolutional neural network (CNN) tends to give suboptimal results because the spatial distribution of dithering is not uniform across the image, whereas the convolution operator is spatially invariant: it applies the same kernel to every pixel, a behavior we refer to as a global transformation. To address this issue, we propose PNG8IRNet, an approach that combines global and self-conditioned transformations to remove dithering artifacts. Our method uses a multilayer perceptron (MLP) to generate a distinct kernel for each pixel, accounting for the spatial non-uniformity of dithering; we call this a self-conditioned transformation. Through a comprehensive set of experiments on multiple datasets, PNG8IRNet demonstrates strong restoration performance and substantially enhances visual understanding.
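As a purely illustrative aid (not taken from the paper), the sketch below shows one way the two transformations described in the abstract could be combined in PyTorch. The module name, layer sizes, and the use of F.unfold are assumptions made for illustration; they do not reproduce PNG8IRNet's actual architecture, losses, or training setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalSelfConditionedBlock(nn.Module):
    """Illustrative sketch: a spatially shared convolution (global transformation)
    combined with per-pixel kernels predicted by an MLP (self-conditioned
    transformation). A toy under stated assumptions, not the PNG8IRNet design."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.k = kernel_size
        # Global transformation: one kernel shared by every pixel.
        self.global_conv = nn.Conv2d(channels, channels, kernel_size,
                                     padding=kernel_size // 2)
        # Self-conditioned transformation: a per-pixel MLP (1x1 convolutions)
        # that predicts a k*k kernel for each spatial location.
        self.kernel_mlp = nn.Sequential(
            nn.Conv2d(channels, channels, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, kernel_size * kernel_size, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Spatially uniform branch: the same kernel applied everywhere.
        global_out = self.global_conv(x)
        # Predict one normalized k*k kernel per pixel (shared across channels here).
        kernels = self.kernel_mlp(x).softmax(dim=1)             # (b, k*k, h, w)
        # Gather each pixel's k*k neighbourhood and apply its own kernel.
        patches = F.unfold(x, self.k, padding=self.k // 2)      # (b, c*k*k, h*w)
        patches = patches.view(b, c, self.k * self.k, h, w)
        local_out = (patches * kernels.unsqueeze(1)).sum(dim=2)
        return global_out + local_out
```

For example, passing a tensor of shape (1, 64, 32, 32) through GlobalSelfConditionedBlock(64) returns a tensor of the same shape, with the MLP branch adapting its filter to the local dithering pattern at each pixel while the shared convolution handles image-wide structure.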


Published In

VINCI '23: Proceedings of the 16th International Symposium on Visual Information Communication and Interaction
September 2023, 308 pages
ISBN: 9798400707513
DOI: 10.1145/3615522

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. dithering removal
2. neural networks
3. spatial non-uniformity
4. visual understanding

