Abstract
In this paper, we regard edit propagation as a multi-class classification problem and deep neural network (DNN) is used to solve the problem. We design a shallow and fully convolutional DNN that can be trained end-to-end. To achieve this, our method uses combinations of low-level visual features, which are extracted from the input image, and spatial features, which are computed through transforming user interactions, as input of the DNN, which efficiently performs a joint learning of visual and spatial features. We then train the DNN on many of such combinations in order to build a DNN-based pixel-level classifier. Our DNN is also equipped with patch-by-patch training and whole image estimation, speeding up learning and inference. Finally, we improve classification accuracy of the DNN by employing a fully connected conditional random field. Experimental results show that our method can respond to user interactions well and generate precise results compared with the state-of-art edit propagation approaches. Furthermore, we demonstrate our method on various applications.
Similar content being viewed by others
References
Levin, A., Lischinski, D., Weiss, Y.: Colorization using optimization. ACM Trans. Graph. 23(3), 689–694 (2004)
Yatziv, L., Sapiro, G.: Fast image and video colorization using chrominance blending. IEEE Trans. Image Process. 15(5), 1120–1129 (2006)
Qu, Y., Wong, T.T., Heng, P.A.: Manga colorization. ACM Trans. Graph. 25(3), 1214–1220 (2006)
Luan, Q., Wen, F., Cohen-Or, D., Liang, L., Xu, Y.Q., Shum, H.Y.: Natural image colorization. In: Proceedings of the Eurographics Symposium on Rendering Techniques, pp. 309–320 (2007)
Lischinski, D., Farbman, Z., Uyttendaele, M., Szeliski, R.: Interactive local adjustment of tonal values. ACM Trans. Graph. 25(3), 646–653 (2006)
Pellacini, F., Lawrence, J.: AppWand: editing measured materials using appearance-driven optimization. ACM Trans. Graph. 26(3), 54 (2007)
An, X., Pellacini, F.: AppProp: all-pairs appearance-space edit propagation. ACM Trans. Graph. 27(3), 40:1–40:9 (2008)
Xu, K., Li, Y., Ju, T., Hu, S.M., Liu, T.Q.: Efficient affinity-based edit propagation using K-D tree. ACM Trans. Graph. 28(5), 118:1–118:6 (2009)
Li, Y., Ju, T., Hu, S.M.: Instant propagation of sparse edits on images and videos. Comput. Graph. Forum 29(7), 2049–2054 (2010)
Bie, X., Huang, H., Wang, W.: Real time edit propagation by efficient sampling. Comput. Graph. Forum 30(7), 2041–2048 (2011)
Xiao, C., Nie, Y., Tang, F.: Efficient edit propagation using hierarchical data structure. IEEE Trans. Vis. Comput. Graph. 17(8), 1135–1147 (2011)
Criminisi, A., Sharp, T., Rother, C., Perez, P.: Geodesic image and video editing. ACM Trans. Graph. 29(5), 134:1–134:15 (2010)
Farbman, Z., Fattal, R., Lischinski, D.: Diffusion maps for edge-aware image editing. ACM Trans. Graph. 29(6), 145:1–145:10 (2010)
Ma, L.Q., Xu, K.: Efficient antialiased edit propagation for images and videos. Comput. Graph. 36(8), 1005–1012 (2012)
Chen, X., Zou, D., Zhao, Q., Tan, P.: Manifold preserving edit propagation. ACM Trans. Graph. 31(6), 132:1–132:7 (2012)
Musialski, P., Cui, M., Ye, J.P., Razdan, A., Wonka, P.: A framework for interactive image color editing. Vis. Comput. 39(11), 1173–1186 (2013)
Xu, L., Yan, Q., Jia, J.Y.: A sparse control model for image and video editing. ACM Trans. Graph. 32(6), 197:1–197:10 (2013)
Yatagawa, T., Yamaguchi, Y.: Sparse pixel sampling for appearance edit propagation. Vis. Comput. 31, 1101–1111 (2015)
Li, Y., Adelson, E., Agarwala, A.: Scribbleboost: adding classification to edge-aware interpolation of local image and video adjustments. EGSR 08, 1255–1264 (2008)
Dalmau, O., Rivera, M., Alarcon, T.: Bayesian scheme for interactive colourization, recolourization and image/video editing. Comput. Graph. Forum 29(8), 2372–2386 (2010)
Chen, X., Zou, D., Li, J., Cao, X., Zhao, Q., Zhang, H.: Sparse dictionary learning for edit propagation of high-resolution images. CVPR 2014, 2854–2861 (2014)
Levin, A., Lischinski, D., Weiss, Y.: A closed form solution to natural image matting. IEEE Trans. Pattern Anal. Mach. Intell. 30(2), 228–242 (2008)
Cho, H., Lee, H., Kang, H., Lee, S.: Bilateral texture filtering. ACM Trans. Graph. 33(4), 128:1–128:8 (2014)
Cambra, A.B., Murillo, A.C., Munõz, A.: A generic tool for interactive complex image editing. Vis. Comput. 34, 1493–1505 (2017)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS, pp. 1106–1114 (2012)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR arXiv:1409.1556 (2014)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR, pp. 580–587 (2014)
Ren, S., He, K.M., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS, pp. 91–99 (2015)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR, pp. 3431–3440 (2015)
Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.L.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 834–848 (2018)
Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 295–307 (2016)
Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion. ACM Trans. Graph. 36(4), 1–14 (2017)
Endo, Y., Iizuka, S., Kanamori, Y., Mitani, J.: DeepProp: extracting deep features from a single image for edit propagation. Comput. Graph. Forum 35, 189–201 (2016)
Yan, Z., Zhang, H., Wang, B., Paris, S., Yu, Y.: Automatic photo adjustment using deep neural networks. ACM Trans. Graph. 35(2), 11:1–11:15 (2016)
Xu, N., Price, B.L., Cohen, S., Yang, J., Huang, T.S.: Deep interactive object selection. In: CVPR, pp. 373–381 (2016)
Zhang, R., Zhu, J.Y., Isola, P., Geng, X.Y., Lin, A.S., Yu, T., Efros, A.A.: Real-time user-guided image colorization with learned deep priors. ACM Trans. Graph. 36(4), 119:1–119:11 (2017)
Kingma, D. P., Ba, J.: Adam: a method for stochastic optimization. CoRR arXiv:1412.6980 (2014)
Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Susstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2274–2282 (2012)
Krähenbühl P., Koltun V.: Efficient inference in fully connected CRFs with Gaussian edge potentials. In: NIPS, pp. 109–117 (2011)
Achanta, R., Hemami, S., Estrada, F., Susstrunk, S.: Frequency-tuned salient region detection. In: CVPR, pp. 1597–1604 (2009)
Lin, T., Maire, M., Belongie, S.J., Hays, J., Perona, P., Ramanan, D., Dollar, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: ECCV, pp. 740–755 (2014)
Gui, Y., Zeng, G., Tang, W.: Fast and robust image cutout using bilateral grid and confidence based color model. J. Comput. Aided Des. Comput. Graph. 30(7), 1284–1296 (2018). (in Chinese)
Acknowledgements
We would like to thank Prof. Yiyu Cai and Dr. Zhifeng Xie for proofreading our paper. We would also like to thank the reviewers for their valuable comments. This study was funded by the National Natural Science Foundations of P. R. China (Grant Nos. 61402053; 61602059; 61772087; 61802031) and the Scientific Research Fund of Education Department of Hunan Province (Grant Nos. 16C0046; 16A008).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Gui, Y., Zeng, G. Joint learning of visual and spatial features for edit propagation from a single image. Vis Comput 36, 469–482 (2020). https://doi.org/10.1007/s00371-019-01633-6
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00371-019-01633-6