Abstract
In recent research, few-shot generation models have attracted increasing interest in computer vision. They aim to generate additional data for a given domain from only a few available training examples. Although many methods have been introduced for few-shot generation, most are unstable during training and can only produce cookie-cutter images. To alleviate these issues, we propose a novel few-shot generation method based on a classifier-free conditional diffusion model. Specifically, we train an autoencoder on seen categories and apply patch-discriminator adversarial training to improve reconstruction quality. For a k-shot task, we then extract k image features and compute conditional information to guide the training and sampling of the diffusion model. To avoid the homogeneity of conditional information caused by a prototype model, we use a Local Fusion Module (LFM) to learn diverse features. Extensive experiments on three well-known datasets clearly demonstrate the effectiveness of the proposed method for few-shot image generation.
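The two conditioning ideas in the abstract can be sketched in a few lines. The snippet below is an illustrative simplification, not the authors' implementation: `fuse_features` stands in for the paper's fusion module by drawing random convex weights over the k support features instead of taking a uniform prototype mean, and `guided_eps` shows the standard classifier-free guidance blend of conditional and unconditional noise predictions with a hypothetical guidance scale `w`.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_features(feats, weights=None):
    """Fuse k support-image features into one conditioning vector.

    A plain prototype is the uniform mean over the k features; drawing
    random convex (Dirichlet) weights instead yields a different
    condition on every call, avoiding a single fixed prototype.
    This is a simplified sketch, not the paper's exact LFM.
    """
    k = feats.shape[0]
    if weights is None:
        weights = rng.dirichlet(np.ones(k))  # non-negative, sums to 1
    return weights @ feats

def guided_eps(eps_cond, eps_uncond, w):
    """Classifier-free guidance: push the unconditional noise
    prediction toward the conditional one by guidance scale w."""
    return eps_uncond + w * (eps_cond - eps_uncond)
```

With w = 0 the sampler ignores the condition entirely, while larger w trades sample diversity for stronger adherence to the fused k-shot condition.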
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Hou, J., Luo, L., Yang, J. (2023). Local-Fusion Diffusion Model for Enhancing Few-Shot Image Generation. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14355. Springer, Cham. https://doi.org/10.1007/978-3-031-46305-1_22
DOI: https://doi.org/10.1007/978-3-031-46305-1_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46304-4
Online ISBN: 978-3-031-46305-1