ADORE: Adaptive Diffusion Optimized Restoration for AI-Generated Facial Imagery

  • Conference paper
  • In: Pattern Recognition and Computer Vision (PRCV 2023)

Abstract

We introduce ADORE (Adaptive Diffusion Optimized Restoration), a method that addresses facial distortion in diffusion-based, language-guided image generation. ADORE enhances facial quality adaptively, conditioned on the characteristics and style of the generated image, improving the visual fidelity of AI-generated imagery. It also mitigates boundary artifacts during the face-background fusion process, offering a novel way to stabilize image restoration with generative models. In addition, ADORE supports text-driven, fine-grained facial refinement by leveraging the model’s open-domain synthesis capability. Rigorous experiments validate ADORE’s ability to produce high-quality, style-consistent facial restorations. As the first method tailored to enhancing facial generation quality in text-to-image models, ADORE addresses a pressing issue and opens new avenues in image generation.
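The boundary-artifact mitigation mentioned above can be illustrated at a high level: a restored face crop is pasted back into the full image through a soft alpha mask so that the seam between face and background fades gradually rather than ending in a hard edge. The sketch below is a minimal, hypothetical illustration of that idea; the function name, the simple linear feathering, and all parameters are assumptions for exposition and are not ADORE's actual fusion algorithm.

```python
import numpy as np

def feathered_blend(background, restored_face, box, feather=8):
    """Paste a restored face crop into the full image with a feathered
    alpha mask to soften boundary seams.

    Illustrative sketch only; the real fusion strategy in the paper is
    more involved. `box` is (x0, y0, x1, y1) in background coordinates,
    and `restored_face` must have shape (y1-y0, x1-x0, channels).
    """
    x0, y0, x1, y1 = box
    h, w = y1 - y0, x1 - x0
    # Alpha mask: 1.0 in the interior, ramping linearly toward 0 at the
    # crop border over `feather` pixels on every side.
    mask = np.ones((h, w), dtype=np.float32)
    for i in range(feather):
        alpha = (i + 1) / (feather + 1)
        mask[i, :] = np.minimum(mask[i, :], alpha)
        mask[h - 1 - i, :] = np.minimum(mask[h - 1 - i, :], alpha)
        mask[:, i] = np.minimum(mask[:, i], alpha)
        mask[:, w - 1 - i] = np.minimum(mask[:, w - 1 - i], alpha)
    out = background.astype(np.float32).copy()
    region = out[y0:y1, x0:x1]
    out[y0:y1, x0:x1] = (mask[..., None] * restored_face
                         + (1.0 - mask[..., None]) * region)
    return out.astype(background.dtype)
```

Near the crop border the output is a convex combination of the restored face and the original background, which is what suppresses visible seams; deep inside the crop the restored face is kept unchanged.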


Notes

  1. https://huggingface.co/runwayml/stable-diffusion-v1-5.

  2. https://civitai.com/models/6424/chilloutmix

     https://civitai.com/models/3627/protogen-v22-anime-official-release

     https://civitai.com/models/4201/realistic-vision-v20.


Acknowledgements

This work was supported in part by the project “Digital Twin Application Demonstration for New Museum Public Service Models”, a key research topic under the National Key Research and Development Program of China, Grant No. 2022YFF0904305.

Author information

Correspondence to Hong Chen.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, J., Chen, H., Qi, G. (2024). ADORE: Adaptive Diffusion Optimized Restoration for AI-Generated Facial Imagery. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14435. Springer, Singapore. https://doi.org/10.1007/978-981-99-8552-4_29

  • DOI: https://doi.org/10.1007/978-981-99-8552-4_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8551-7

  • Online ISBN: 978-981-99-8552-4

  • eBook Packages: Computer Science, Computer Science (R0)
