
EditAnything: Empowering Unparalleled Flexibility in Image Editing and Generation

Published: 27 October 2023
DOI: 10.1145/3581783.3612680

ABSTRACT

Image editing plays a vital role in computer vision, aiming to manipulate images realistically while ensuring that the edited content integrates seamlessly with the original. It has numerous applications across a wide range of fields. In this work, we present EditAnything, a novel approach that gives users unparalleled flexibility in editing and generating image content. EditAnything introduces an array of advanced features, including cross-image dragging (e.g., try-on), region-interactive editing, controllable layout generation, and virtual character replacement. With these capabilities, users can perform interactive and flexible edits that produce compelling results while preserving the integrity of the original image. Through its diverse range of tools, EditAnything caters to a wide spectrum of editing needs, pushing the boundaries of image editing and unlocking exciting new possibilities. The source code is released at https://github.com/sail-sg/EditAnything.
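To make the region-interactive editing capability more concrete, below is a minimal illustrative sketch, not the released EditAnything implementation: it combines an interactive segmentation model (Segment Anything) with an off-the-shelf text-guided inpainting diffusion pipeline from the diffusers library. The input file name, click coordinates, checkpoint path, and edit prompt are assumptions made purely for illustration.

# Illustrative sketch of region-interactive editing: segment a user-clicked region,
# then regenerate it with a text-guided inpainting diffusion model.
# NOTE: this is NOT the EditAnything implementation; file names, checkpoints, the
# click location, and the prompt below are placeholder assumptions.
import numpy as np
import torch
from PIL import Image
from segment_anything import SamPredictor, sam_model_registry   # facebookresearch/segment-anything
from diffusers import StableDiffusionInpaintPipeline            # huggingface/diffusers

# 1) Predict a mask for the region the user clicked on.
image = np.array(Image.open("input.jpg").convert("RGB"))        # hypothetical input image
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)
masks, scores, _ = predictor.predict(
    point_coords=np.array([[512, 384]]),   # assumed click location (x, y)
    point_labels=np.array([1]),            # 1 = foreground point
)
mask = Image.fromarray((masks[np.argmax(scores)] * 255).astype(np.uint8))

# 2) Re-synthesize only the masked region according to a text prompt.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
edited = pipe(
    prompt="a red leather jacket",          # assumed edit instruction
    image=Image.fromarray(image).resize((512, 512)),
    mask_image=mask.resize((512, 512)),
).images[0]
edited.save("edited.jpg")

Features such as cross-image dragging and controllable layout generation would additionally require conditioning the generator on a reference image or layout map (for example, through a ControlNet-style adapter), which is beyond the scope of this sketch.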

Published in

MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN: 9798400701085
DOI: 10.1145/3581783

Copyright © 2023 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 27 October 2023

Qualifiers

demonstration

Acceptance Rates

Overall Acceptance Rate: 995 of 4,171 submissions, 24%
