Abstract
Object removal is a popular image manipulation technique that mainly involves two technical problems: object segmentation and image inpainting. In the conventional object removal framework, the segmentation step requires a mask or manual pre-processing, and the quality of the inpainting step still needs improvement. In this paper, we propose a new object removal framework based on deep learning. Conditional random fields as recurrent neural networks (CRF-RNN) are used to segment the target semantically, which avoids the need for a mask or manual pre-processing. For the inpainting step, we propose a new method for filling the missing region. Representation features are computed from the convolutional neural network (CNN) feature maps of the regions neighboring the missing region. Then, the L-BFGS algorithm for large-scale bound-constrained optimization is used to synthesize the missing region based on the CNN representation features of similar neighboring regions. We evaluate the proposed method by applying it to different kinds of images and textures for object removal and inpainting. Experimental results demonstrate that our method outperforms the conventional method in both inpainting and object removal.
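The inpainting step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: random 3x3 filters followed by ReLU stand in for a pretrained CNN layer, the representation feature is assumed to be a Gram matrix of the feature maps (the abstract does not specify its form), the mask is drawn by hand instead of coming from CRF-RNN, and the names `feature_maps`, `gram`, and `inpaint` are hypothetical. Only the pixels inside the mask are optimized, with SciPy's L-BFGS-B solver and box bounds on pixel values.

```python
import numpy as np
from scipy.optimize import minimize


def feature_maps(img, filters):
    """Toy stand-in for one CNN layer: valid 3x3 convolution, then ReLU.

    img: (H, W) array; filters: (C, 3, 3) array.
    Returns a (C, N) array, N being the number of spatial positions.
    """
    windows = np.lib.stride_tricks.sliding_window_view(img, (3, 3))
    responses = np.einsum("ijkl,ckl->cij", windows, filters)
    return np.maximum(responses, 0.0).reshape(filters.shape[0], -1)


def gram(feats):
    """Assumed representation feature: channel correlations averaged over positions."""
    return feats @ feats.T / feats.shape[1]


def inpaint(img, mask, filters, reference):
    """Fill img[mask] so the image's Gram matrix matches that of a neighbor region."""
    target = gram(feature_maps(reference, filters))

    def loss(free):
        candidate = img.copy()
        candidate[mask] = free  # only the missing pixels are free variables
        diff = gram(feature_maps(candidate, filters)) - target
        return float(np.sum(diff ** 2))

    # Start from the mean of the known pixels; keep pixel values in [0, 1].
    x0 = np.full(int(mask.sum()), img[~mask].mean())
    result = minimize(loss, x0, method="L-BFGS-B",
                      bounds=[(0.0, 1.0)] * x0.size)
    out = img.copy()
    out[mask] = result.x
    return out, loss(x0), float(result.fun)


# Demo on a synthetic striped texture with a hand-made 4x4 hole.
rng = np.random.default_rng(0)
filters = rng.normal(size=(4, 3, 3))           # hypothetical "CNN" filters
texture = np.tile(np.array([0.2, 0.8]), (12, 6))
mask = np.zeros(texture.shape, dtype=bool)
mask[4:8, 4:8] = True
damaged = texture.copy()
damaged[mask] = 0.0
reference = damaged[:, :4]                      # neighbor region without the hole
out, initial_loss, final_loss = inpaint(damaged, mask, filters, reference)
```

The gradient is left to SciPy's finite-difference approximation, which is acceptable for the 16 free pixels here; a real CNN would supply the gradient by backpropagation.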
Acknowledgements
We thank the anonymous reviewers and the editor for their valuable comments. This work was supported by the National Natural Science Foundation of China (Nos. 61772387 and 61372068), the Research Fund for the Doctoral Program of Higher Education of China (No. 20130203110005), the Fundamental Research Funds for the Central Universities (No. K5051301033), the 111 Project (No. B08038), and the ISN State Key Laboratory.
Additional information
Communicated by C. Xu.
About this article
Cite this article
Cai, X., Song, B. Semantic object removal with convolutional neural network feature-based inpainting approach. Multimedia Systems 24, 597–609 (2018). https://doi.org/10.1007/s00530-018-0585-x
DOI: https://doi.org/10.1007/s00530-018-0585-x