Semantic object removal with convolutional neural network feature-based inpainting approach

  • Regular Paper
Multimedia Systems

Abstract

Object removal is a popular image manipulation technique that involves two technical problems: object segmentation and image inpainting. In the conventional object removal framework, the segmentation step requires a user-supplied mask or manual pre-processing, and the quality of the inpainting step still needs improvement. In this paper, we propose a new object removal framework based on deep learning. Conditional random fields as recurrent neural networks (CRF-RNN) are used to segment the target semantically, which avoids the need for a mask or manual pre-processing. For the inpainting part, we propose a new method for filling the missing region: representation features are computed from the convolutional neural network (CNN) feature maps of the regions neighboring the missing region, and limited-memory BFGS for large-scale bound-constrained optimization (L-BFGS) is then used to synthesize the missing region from the CNN representation features of similar neighboring regions. We evaluate the proposed method by applying it to different kinds of images and textures for object removal and inpainting. Experimental results demonstrate that our method outperforms the conventional method in both inpainting and object removal.
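The inpainting step described above can be illustrated with a toy sketch. It is not the authors' implementation, and it rests on several assumptions: the "representation features" are taken to be Gram-matrix statistics of the feature maps (in the style of the texture synthesis work the paper cites), a hand-written `feature_maps` function with three fixed filters stands in for real CNN feature maps, and SciPy's `L-BFGS-B` solver plays the role of the optimizer. The names `feature_maps`, `gram`, and `inpaint` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def feature_maps(patch):
    """Stand-in for CNN feature maps (assumption): intensity plus x/y differences."""
    dx = np.diff(patch, axis=1, append=patch[:, -1:])
    dy = np.diff(patch, axis=0, append=patch[-1:, :])
    return np.stack([patch, dx, dy]).reshape(3, -1)


def gram(patch):
    """Gram matrix of the feature maps, normalized by the number of pixels."""
    f = feature_maps(patch)
    return f @ f.T / f.shape[1]


def inpaint(image, hole, neighbor, seed=0):
    """Fill image[hole] so its Gram statistics approach those of image[neighbor].

    `hole` and `neighbor` are tuples of slices selecting rectangular regions.
    Returns the filled image plus the loss before and after optimization.
    """
    target = gram(image[neighbor])
    shape = image[hole].shape

    def loss(x):
        # Squared distance between the Gram matrices of the candidate fill
        # and of the neighboring region.
        return float(np.sum((gram(x.reshape(shape)) - target) ** 2))

    # Start from noise and let L-BFGS-B adjust the hole pixels within [0, 1].
    # No analytic gradient is supplied, so SciPy uses finite differences.
    x0 = np.random.default_rng(seed).uniform(0.0, 1.0, size=shape).ravel()
    res = minimize(loss, x0, method="L-BFGS-B", bounds=[(0.0, 1.0)] * x0.size)

    out = image.copy()
    out[hole] = res.x.reshape(shape)
    return out, loss(x0), float(res.fun)
```

For example, with a striped test image `np.tile([0.2, 0.8], (12, 6))`, a hole `(slice(4, 8), slice(4, 8))`, and the region directly above it as the neighbor, `inpaint` returns an image whose pixels outside the hole are unchanged. A real system would replace `feature_maps` with activations from a pretrained CNN and would obtain the hole from the segmentation stage rather than from hand-written slices.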

Acknowledgements

We thank the anonymous reviewers and the editor for their valuable comments. This work was supported by the National Natural Science Foundation of China (Nos. 61772387 and 61372068), the Research Fund for the Doctoral Program of Higher Education of China (No. 20130203110005), the Fundamental Research Funds for the Central Universities (No. K5051301033), the 111 Project (No. B08038), and the ISN State Key Laboratory.

Author information

Corresponding author

Correspondence to Bin Song.

Additional information

Communicated by C. Xu.

About this article

Cite this article

Cai, X., Song, B. Semantic object removal with convolutional neural network feature-based inpainting approach. Multimedia Systems 24, 597–609 (2018). https://doi.org/10.1007/s00530-018-0585-x

  • DOI: https://doi.org/10.1007/s00530-018-0585-x
