Abstract
Most image-to-image translation methods require a large number of training images, which restricts their applicability. We instead propose ManiFest: a framework for few-shot image translation that learns a context-aware representation of a target domain from only a few images. To enforce feature consistency, our framework learns a style manifold between the source domain and additional anchor domains (assumed to comprise large numbers of images). The learned manifold is interpolated and deformed towards the few-shot target domain via patch-based adversarial and feature-statistics alignment losses. All components are trained simultaneously in a single end-to-end loop. In addition to the general few-shot translation task, our approach can alternatively be conditioned on a single exemplar image to reproduce its specific style. Extensive experiments demonstrate the efficacy of ManiFest on multiple tasks, outperforming the state of the art on all metrics. Our code is available at https://github.com/cv-rits/ManiFest.
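To make the abstract's two key mechanisms concrete, below is a minimal PyTorch sketch of (i) interpolating between two anchor style codes and deforming the result towards the few-shot target with a learned residual, and (ii) a feature-statistics alignment loss that matches per-channel means and standard deviations, in the spirit of Huang and Belongie's adaptive instance normalization. All names and shapes here (interpolate_style, stats_alignment_loss, alpha, delta) are illustrative assumptions, not the authors' API; the actual implementation is in the repository linked above.

```python
import torch
import torch.nn.functional as F

def interpolate_style(w_a: torch.Tensor, w_b: torch.Tensor,
                      alpha: float, delta: torch.Tensor = None) -> torch.Tensor:
    """Point on the manifold spanned by two anchor style codes, optionally
    deformed by a learned residual towards the few-shot target domain."""
    w = (1.0 - alpha) * w_a + alpha * w_b
    return w if delta is None else w + delta

def stats_alignment_loss(feat_fake: torch.Tensor,
                         feat_few_shot: torch.Tensor) -> torch.Tensor:
    """Match per-channel mean/std of (B, C, H, W) feature maps extracted
    from translated images and the few-shot target images."""
    mu_f, mu_t = feat_fake.mean(dim=(2, 3)), feat_few_shot.mean(dim=(2, 3))
    sd_f, sd_t = feat_fake.std(dim=(2, 3)), feat_few_shot.std(dim=(2, 3))
    return F.l1_loss(mu_f, mu_t) + F.l1_loss(sd_f, sd_t)

# Usage sketch with hypothetical shapes: a batch of 4 style codes of size 256.
w_a, w_b = torch.randn(4, 256), torch.randn(4, 256)  # anchor style codes
delta = torch.zeros(256, requires_grad=True)          # learned deformation
w = interpolate_style(w_a, w_b, alpha=0.3, delta=delta)
```

The patch-based adversarial term from the abstract is not shown; in a few-shot setting it would be computed on random crops, so the discriminator sees many patches rather than a handful of full target images.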
Acknowledgements
This work was partly funded by Vislab Ambarella and the French project SIGHT (ANR-20-CE23-0016), and received support from the Service de coopération et d’action culturelle du Consulat général de France à Québec. It used HPC resources from GENCI-IDRIS (Grant 2021-AD011012808).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Pizzati, F., Lalonde, J.-F., de Charette, R.: ManiFest: manifold deformation for few-shot image translation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision – ECCV 2022. LNCS, vol. 13677. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19790-1_27
DOI: https://doi.org/10.1007/978-3-031-19790-1_27
Print ISBN: 978-3-031-19789-5
Online ISBN: 978-3-031-19790-1