Abstract
Image inpainting methods based on deep convolutional neural networks (DCNNs), especially generative adversarial networks (GANs), have made tremendous progress owing to their powerful representation capabilities. These methods can generate visually plausible content and textures; however, existing deep models built on a single type of receptive field tend not only to produce image artifacts and content mismatches but also to ignore the correlation between the hole region and distant spatial locations in the image. To address these problems, in this paper we propose a new GAN-based generative model composed of a two-stage encoder–decoder with a Multi-Scale Encoder Network (MSE-Net) and a new Contextual Attention Model based on the Absolute Value (CAM-AV). The former encodes features with convolution kernels of different sizes, improving the characterization of abstract features; the latter uses a new search algorithm to strengthen feature matching within the network. Our network is fully convolutional and can therefore complete holes of arbitrary size, number, and spatial location in the image. Experiments on regular and irregular inpainting over several datasets, including CelebA and Places2, demonstrate that the proposed method produces higher-quality inpainting results with more plausible content than most existing state-of-the-art methods.
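To make the multi-scale encoding idea concrete, the following is a minimal PyTorch sketch of a block that convolves the same input with kernels of several sizes in parallel and concatenates the results; the specific kernel sizes (3/5/7), channel split, and ELU activation are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

class MultiScaleEncoderBlock(nn.Module):
    # Hypothetical multi-scale block: parallel convolutions with different
    # kernel sizes (assumed 3/5/7), fused by channel-wise concatenation.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        branch_ch = out_ch // 3
        self.branch3 = nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_ch, branch_ch, kernel_size=5, padding=2)
        self.branch7 = nn.Conv2d(in_ch, out_ch - 2 * branch_ch, kernel_size=7, padding=3)
        self.act = nn.ELU()

    def forward(self, x):
        # Each branch sees the same input at a different receptive-field size;
        # concatenation fuses fine and coarse context into one feature map.
        return self.act(torch.cat([self.branch3(x), self.branch5(x), self.branch7(x)], dim=1))

Likewise, one plausible reading of the absolute-value-based matching in CAM-AV is to score foreground (hole) locations against candidate background patches by negative L1 distance rather than the cosine similarity of the original contextual attention module; the function below is a hypothetical sketch under that assumption, using flattened feature vectors for simplicity.

def l1_patch_attention(fg_feats, bg_patches):
    # fg_feats:   (N, C) flattened features at N foreground (hole) locations
    # bg_patches: (M, C) flattened features of M background patches
    # Returns an (N, M) attention map: softmax over negative mean absolute
    # difference, so background patches closer in L1 get higher weight.
    dist = (fg_feats.unsqueeze(1) - bg_patches.unsqueeze(0)).abs().mean(dim=-1)
    return torch.softmax(-dist, dim=1)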







Acknowledgements
The authors would like to thank the creators of the CelebA and Places2 datasets, which allowed us to train and evaluate the proposed model. The authors would also like to thank all the reviewers for their insightful comments.
Funding
This work was supported by the National Natural Science Foundation of China under Grants 61674049 and U19A2053 and the Fundamental Research Funds for the Central Universities of China under Grants JZ2019HGTB0092, JZ2020YYPY0089, and JZ2020HGTA0085.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Yang, Y., Cheng, Z., Yu, H. et al. MSE-Net: generative image inpainting with multi-scale encoder. Vis Comput 38, 2647–2659 (2022). https://doi.org/10.1007/s00371-021-02143-0