Abstract
Owing to the large gap between natural and artistic depictions, recognizing visual objects across photographs, art paintings, cartoon pictures, and sketches poses a great challenge. Domain adaptation, which bridges cross-domain discrepancies by learning transferable features, is an effective technology for this problem. However, existing domain-adaptive methods all require target-domain images of the same categories as the source-domain images in order to reduce the domain shift, which limits their applicability when such target images are unavailable. To solve this problem, we construct an end-to-end unsupervised model called the adaptive depiction fusion network (ADFN). Unlike other domain adaptation methods, ADFN recognizes visual objects in artworks using only their natural counterparts: it extends adaptive instance normalization to embed the depiction offset into the source-domain features. Meanwhile, we also provide a complete benchmark, cross-depiction-net, which is large and varied enough to overcome the lack of data for this problem. To evaluate ADFN properly, we compared it with state-of-the-art methods (DAN, DDC, Deep-CORAL, and MRAN) on the cross-depiction-net dataset. The results show that our model outperforms these state-of-the-art methods.
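The core operation the abstract refers to, adaptive instance normalization (AdaIN, Huang and Belongie 2017), re-scales content features channel-wise to match the first- and second-order statistics of style features. The following is a minimal NumPy sketch of the AdaIN operation itself, not the authors' full ADFN network; the function name and shapes are illustrative assumptions.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization (Huang & Belongie, 2017).

    Normalizes each channel of `content` to zero mean / unit std,
    then re-scales it with the per-channel mean and std of `style`.
    Both inputs are feature maps of shape (C, H, W).
    """
    c_mu = content.mean(axis=(1, 2), keepdims=True)   # per-channel content mean
    c_std = content.std(axis=(1, 2), keepdims=True)   # per-channel content std
    s_mu = style.mean(axis=(1, 2), keepdims=True)     # per-channel style mean
    s_std = style.std(axis=(1, 2), keepdims=True)     # per-channel style std
    # Whiten the content statistics, then color them with the style statistics.
    return s_std * (content - c_mu) / (c_std + eps) + s_mu
```

In ADFN the style statistics would come from artwork features, so the normalized source (photo) features carry the target depiction's offset without needing target images of the same categories.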









References
Wu, Q., Cai, H., Hall, P.: Learning graphs to model visual objects across different depictive styles. In: European Conference on Computer Vision, pp. 313–328 (2014)
Long, M., Cao, Y., Wang, J., Jordan, M.I.: Learning transferable features with deep adaptation networks. In: International Conference on Machine Learning, pp. 97–105 (2015)
Tzeng, E.: Deep domain confusion: maximizing for domain invariance. arXiv preprint arXiv:1412.3474 (2014)
Sun, B., Saenko, K.: Deep coral: correlation alignment for deep domain adaptation. In: European Conference on Computer Vision, pp. 443–450 (2016)
Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H.: Domain-adversarial training of neural networks. J. Mach. Learn. Res. 17(1), 2096–2030 (2016)
Bousmalis, K., Trigeorgis, G., Silberman, N., Krishnan, D., Erhan, D.: Domain separation networks. In: Conference and Workshop on Neural Information Processing Systems, pp. 343–351 (2016)
Cao, Z., Long, M., Wang, J., Jordan, M.I.: Partial transfer learning with selective adversarial networks. In: Computer Vision and Pattern Recognition, pp. 2724–2732 (2018)
Li, J.: Cross-depiction problem: recognition and synthesis of photographs and artwork. Comput. Vis. Media 1(2), 91–103 (2015)
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)
Hu, R., Collomosse, J.: A performance evaluation of gradient field HOG descriptor for sketch based image retrieval. Comput. Vis. Image Underst. 117(7), 790–806 (2013)
Wu, Q., Cai, H., Hall, P.: Learning graphs to model visual objects across different depictive styles. Lect. Notes Comput. Sci. 7, 313–328 (2014)
Crowley, E.J., Zisserman, A.: The art of detection. In: European Conference on Computer Vision, pp. 721–737 (2016)
Florea, C., Badea, M., Florea, L., Vertan, C.: Domain transfer for delving into deep networks capacity to de-abstract art. In: Scandinavian Conference on Image Analysis, pp. 337–349 (2017)
Peng, X., Usman, B., Saito, K., Kaushik, N., Hoffman, J., Saenko, K.: Syn2real: a new benchmark for synthetic-to-real visual domain adaptation. arXiv preprint arXiv:1806.09755 (2018)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Conference and Workshop on Neural Information Processing Systems, pp. 1106–1114 (2012)
Deng, W., Zheng, L., Ye, Q., Kang, G., Yang, Y., Jiao, J.: Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 994–1003 (2018)
Zhu, Y., Zhuang, F., Wang, J., Chen, J., Shi, Z., Wu, W.: Multi-representation adaptation network for cross-domain image classification. Neural Netw. 119, 214–221 (2019)
Lee, C.Y., Batra, T., Baig, M.H., Ulbricht, D.: Sliced wasserstein discrepancy for unsupervised domain adaptation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 10285–10295 (2019)
Zhang, Y., Tang, H., Jia, K., Tan, M.: Domain-symmetric networks for adversarial domain adaptation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5031–5040 (2019)
Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2414–2423 (2016)
Li, C., Wand, M.: Combining Markov random fields and convolutional neural networks for image synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2479–2486 (2016)
Li, J.: Visual attribute transfer through deep image analogy. ACM Trans. Graph. 36(4), 120:1–120:15 (2017)
Huang, X., Belongie, S.: Arbitrary style transfer in real-time with adaptive instance normalization. In: IEEE International Conference on Computer Vision, pp. 1501–1510 (2017)
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Conference and Workshop on Neural Information Processing Systems, pp. 2672–2680 (2014)
Taigman, Y., Polyak, A., Wolf, L.: Unsupervised cross-domain image generation. arXiv preprint arXiv:1611.02200 (2016)
Liu, M.Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. arXiv preprint arXiv:1703.00848 (2017)
Kim, T., Cha, M., Kim, H., Lee, J., Kim, J.: Learning to discover cross-domain relations with generative adversarial networks. arXiv preprint arXiv:1703.05192 (2017)
Li, D., Yang, Y., Song, Y.Z.: Deeper, broader and artier domain generalization. In: IEEE International Conference on Computer Vision, pp. 5542–5550 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Deng, J., Dong, W., Socher, R.: Imagenet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009)
Bai, T., Wang, C., Wang, Y., Huang, L., Xing, F.: A novel deep learning method for extracting unspecific biomedical relation. Concurrency Comput. Pract. Exp. 32(1), e5005 (2020)
Wang, Y., Huang, L., Guo, S., Gong, L., Bai, T.: A novel MEDLINE topic indexing method using image presentation. J. Vis. Commun. Image Represent. 58, 130–137 (2019)
Yang, H., Min, K.: Classification of basic artistic media based on a deep convolutional approach. Vis. Comput. 36, 559–578 (2020)
Zhou, F., Hu, Y., Shen, X.: MSANet: multimodal self-augmentation and adversarial network for RGB-D object recognition. Vis. Comput. 35, 1583–1594 (2019)
Bai, T., Gong, L., Wang, Y.: A method for exploring implicit concept relatedness in biomedical knowledge network. BMC Bioinform. 17, 53–66 (2016)
Wang, L., Wang, Z., Yang, X.: Photographic style transfer. Vis. Comput. 36, 317–331 (2020)
Zhao, H., Rosin, P.L., Lai, Y.K.: Automatic semantic style transfer using deep convolutional neural networks and soft masks. Vis. Comput. 36, 1307–1324 (2020)
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61702214), the Development Project of Jilin Province of China (No. 20200801033GH), the Jilin Provincial Key Laboratory of Big Data Intelligent Computing (No. 20180622002JC), and the Fundamental Research Funds for the Central Universities, JLU.
Huang, L., Wang, Y. & Bai, T. Recognizing art work image from natural type: a deep adaptive depiction fusion method. Vis Comput 37, 1221–1232 (2021). https://doi.org/10.1007/s00371-020-01995-2