DOI: 10.1145/3581783.3611742

Sequential Affinity Learning for Video Restoration

Published: 27 October 2023

Abstract

Video restoration networks aim to recover high-quality frame sequences from degraded ones. Traditional video restoration methods, however, rely heavily on temporal modeling operators or optical flow estimation, which limits their versatility. This work presents a novel approach to video restoration that eliminates inefficient temporal modeling operators and pixel-level feature alignment from the network architecture. The proposed Sequential Affinity Learning Network (SALN) is built on an affinity mechanism that establishes direct correspondences between the query frame, the degraded sequence, and the restored frames in latent space. This perspective enables more accurate and effective restoration of video content without relying on temporal modeling operators or optical flow estimation. In addition, we enhance the design of the channel-wise self-attention block to improve the decoder's performance for video restoration. SALN outperforms previous state-of-the-art methods by a significant margin on several classic video tasks, including video deraining, video dehazing, and video waterdrop removal, while demonstrating excellent efficiency. As a network that departs substantially from prior video restoration methods, SALN aims to offer new ideas and directions for video restoration.
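The abstract does not specify the architecture, but the core affinity idea it describes can be illustrated with a minimal sketch: the query frame's latent features are matched against the latent features of each frame in the degraded sequence, and the resulting affinities act as soft weights for aggregating sequence features. Everything below (function names, flat latent vectors, dot-product similarity) is an assumption for illustration, not the paper's actual design.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def affinity_aggregate(query, frames):
    """Aggregate latent frame features by their affinity to the query frame.

    `query` is the latent feature of the query frame (a flat vector here,
    purely for illustration); `frames` holds the latent features of the
    degraded sequence. Affinities are scaled dot products turned into soft
    weights, and the output is the affinity-weighted combination of the
    sequence features -- no optical flow or temporal operator involved.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, f) / scale for f in frames])
    dim = len(query)
    return [sum(w * f[j] for w, f in zip(weights, frames)) for j in range(dim)]
```

For example, if every frame in the sequence carries the same latent feature, the aggregate reduces to that feature; frames more similar to the query receive proportionally larger weights.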


Cited By

  • (2024) Timeline and Boundary Guided Diffusion Network for Video Shadow Detection. Proceedings of the 32nd ACM International Conference on Multimedia, 166–175. DOI: 10.1145/3664647.3681236
  • (2024) RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining. Proceedings of the 32nd ACM International Conference on Multimedia, 7881–7890. DOI: 10.1145/3664647.3680916
  • (2024) See and Think: Embodied Agent in Virtual Environment. Computer Vision – ECCV 2024, 187–204. DOI: 10.1007/978-3-031-73242-3_11

    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023, 9913 pages
    ISBN: 9798400701085
    DOI: 10.1145/3581783
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. affinity learning
    2. video dehazing
    3. video deraining
    4. video restoration

    Qualifiers

    • Research-article

    Funding Sources

    • Xiamen Ocean and Fisheries Development Special Funds
    • Natural Science Foundation of Fujian Province
    • Youth Science and Technology Innovation Program of Xiamen Ocean and Fisheries Development Special Funds

    Conference

    MM '23: The 31st ACM International Conference on Multimedia
    October 29 – November 3, 2023, Ottawa, ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 152
    • Downloads (last 6 weeks): 15
    Reflects downloads up to 02 Mar 2025.

    Cited By

    View all
    • (2024)Timeline and Boundary Guided Diffusion Network for Video Shadow DetectionProceedings of the 32nd ACM International Conference on Multimedia10.1145/3664647.3681236(166-175)Online publication date: 28-Oct-2024
    • (2024)RainMamba: Enhanced Locality Learning with State Space Models for Video DerainingProceedings of the 32nd ACM International Conference on Multimedia10.1145/3664647.3680916(7881-7890)Online publication date: 28-Oct-2024
    • (2024)See and Think: Embodied Agent in Virtual EnvironmentComputer Vision – ECCV 202410.1007/978-3-031-73242-3_11(187-204)Online publication date: 29-Oct-2024
