ABSTRACT
Wide-range image blending is an image processing technique that merges two different images into a panorama by generating the transition region between them. Conventional image inpainting and outpainting methods have been applied to this task, but they often produce severely distorted and blurry structures. The state-of-the-art (SOTA) method uses a U-Net-like model with a feature prediction module for content inference. However, it fails to generate panoramas with smooth transitions and visual realism, particularly when the input images contain distinct scenery, which suggests that the predicted features deviate from the latent distribution of authentic images. In this paper, we propose an effective deep-learning model that integrates vector-wise quantization into feature prediction. Our approach retrieves the closest-matching latent features from a discrete codebook, yielding high-quality wide-range image blending. In addition, we adopt a global-local discriminator for adversarial training to improve the quality of the predicted content and smooth the transition. Experiments on the Scenery6000 dataset demonstrate that our method generates visually appealing panoramic images and outperforms baseline approaches.
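The paper does not spell out its quantization step in the abstract, but vector-wise codebook lookup (in the spirit of VQ-VAE) generally means snapping each predicted feature vector to its nearest entry in a learned discrete codebook. The NumPy sketch below is a hypothetical illustration of that lookup only, not the authors' implementation; the function name and toy data are made up for the example.

```python
import numpy as np

def quantize_features(features, codebook):
    """Replace each continuous feature vector with its nearest codebook entry.

    features: (N, D) array of predicted latent vectors.
    codebook: (K, D) array of learned discrete embeddings.
    Returns the quantized (N, D) array and the chosen codebook indices.
    """
    # Squared Euclidean distance between every feature and every codebook
    # entry, computed without explicit loops via broadcasting:
    # ||f - c||^2 = ||f||^2 - 2 f.c + ||c||^2
    dists = (
        np.sum(features ** 2, axis=1, keepdims=True)
        - 2.0 * features @ codebook.T
        + np.sum(codebook ** 2, axis=1)
    )
    indices = np.argmin(dists, axis=1)
    return codebook[indices], indices

# Toy example: two predicted features snap to the closest of three codes.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
features = np.array([[0.1, -0.1], [1.9, 0.2]])
quantized, idx = quantize_features(features, codebook)
```

Because every output vector is drawn from the codebook, the downstream decoder only ever sees features that lie on the learned discrete manifold, which is the intuition behind the abstract's claim that quantization keeps predictions close to the latent distribution of real images.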
Taming Vector-Wise Quantization for Wide-Range Image Blending with Smooth Transition