DOI: 10.1145/3615522.3615549

Enhancing Visual Understanding by Removing Dithering with Global and Self-Conditioned Transformation

Published: 20 October 2023

Abstract

PNG-8 images are widely used on the web because of their small file size, but their limited color palette often introduces dithering artifacts. Restoring such images with a conventional convolutional neural network (CNN) tends to give suboptimal results because the spatial distribution of dithering is not uniform across the image, whereas the convolution operator is spatially invariant: it applies the same kernel to every pixel, a behavior we refer to as a global transformation. To address this issue, we propose PNG8IRNet, an approach that combines global and self-conditioned transformations to remove dithering artifacts. Our method uses a multilayer perceptron (MLP) to generate a distinct kernel for each pixel, accounting for the spatial non-uniformity of dithering; we call this a self-conditioned transformation. Through a comprehensive set of experiments on multiple datasets, PNG8IRNet demonstrates strong restoration performance and substantially enhances visual understanding.
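As a purely illustrative aid (not taken from the paper), the sketch below shows one way the two transformations described in the abstract could be combined in PyTorch. The module name, layer sizes, and the use of F.unfold are assumptions made for illustration; they do not reproduce PNG8IRNet's actual architecture, losses, or training setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalSelfConditionedBlock(nn.Module):
    """Illustrative sketch: a spatially shared convolution (global transformation)
    combined with per-pixel kernels predicted by an MLP (self-conditioned
    transformation). A toy under stated assumptions, not the PNG8IRNet design."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.k = kernel_size
        # Global transformation: one kernel shared by every pixel.
        self.global_conv = nn.Conv2d(channels, channels, kernel_size,
                                     padding=kernel_size // 2)
        # Self-conditioned transformation: a per-pixel MLP (1x1 convolutions)
        # that predicts a k*k kernel for each spatial location.
        self.kernel_mlp = nn.Sequential(
            nn.Conv2d(channels, channels, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, kernel_size * kernel_size, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Spatially uniform branch: the same kernel applied everywhere.
        global_out = self.global_conv(x)
        # Predict one normalized k*k kernel per pixel (shared across channels here).
        kernels = self.kernel_mlp(x).softmax(dim=1)             # (b, k*k, h, w)
        # Gather each pixel's k*k neighbourhood and apply its own kernel.
        patches = F.unfold(x, self.k, padding=self.k // 2)      # (b, c*k*k, h*w)
        patches = patches.view(b, c, self.k * self.k, h, w)
        local_out = (patches * kernels.unsqueeze(1)).sum(dim=2)
        return global_out + local_out
```

For example, passing a tensor of shape (1, 64, 32, 32) through GlobalSelfConditionedBlock(64) returns a tensor of the same shape, with the MLP branch adapting its filter to the local dithering pattern at each pixel while the shared convolution handles image-wide structure.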


Published In

VINCI '23: Proceedings of the 16th International Symposium on Visual Information Communication and Interaction
September 2023, 308 pages
ISBN: 9798400707513
DOI: 10.1145/3615522

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. dithering removal
2. neural networks
3. spatial non-uniformity
4. visual understanding

