DOI: 10.1145/3474085.3475608

Research Article

Stacked Semantically-Guided Learning for Image De-distortion

Published: 17 October 2021

ABSTRACT

Image de-distortion is important because distortions significantly degrade image quality, and removing them benefits the many computational visual media applications that are designed primarily for high-quality input. To address this challenging problem, we propose a stacked semantically-guided network, which, to our knowledge, is the first attempt at this task. With its stacked architecture and semantically-guided scheme, the network effectively captures and restores distortions around human subjects and the adjacent background. In addition, we propose a discriminative restoration loss function that recovers differently distorted regions of an image in a discriminative manner. As another important contribution, we construct a large-scale dataset for image de-distortion. Extensive qualitative and quantitative experiments show that our method outperforms state-of-the-art approaches.
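The abstract does not give the form of the discriminative restoration loss, but the core idea it describes, weighting reconstruction error differently across semantic regions (e.g. around human subjects versus background), can be sketched as a region-weighted L1 loss. The function name, weighting scheme, and toy arrays below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def discriminative_restoration_loss(pred, target, region_weights):
    """Hypothetical region-weighted L1 loss: pixels in semantically
    important regions (e.g. around human subjects) get larger weights
    than background pixels, so their distortions dominate the loss."""
    per_pixel = np.abs(pred - target)       # per-pixel reconstruction error
    weighted = region_weights * per_pixel   # emphasize important regions
    return weighted.sum() / region_weights.sum()  # weighted mean error

# Toy 2x2 "image": top row is a human region (weight 2.0),
# bottom row is background (weight 1.0).
pred   = np.array([[0.5, 0.5], [0.5, 0.5]])
target = np.array([[1.0, 1.0], [0.0, 0.0]])
w      = np.array([[2.0, 2.0], [1.0, 1.0]])
loss = discriminative_restoration_loss(pred, target, w)
```

Every pixel here has the same absolute error (0.5), but the human-region pixels contribute twice as much to the weighted mean, which is the discriminative behavior the abstract alludes to.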



Published in: MM '21: Proceedings of the 29th ACM International Conference on Multimedia, October 2021, 5796 pages.
ISBN: 9781450386517
DOI: 10.1145/3474085
Copyright © 2021 ACM


Publisher: Association for Computing Machinery, New York, NY, United States




Acceptance Rates: Overall acceptance rate 995 of 4,171 submissions, 24%

