arxiv:2504.02828

Concept Lancet: Image Editing with Compositional Representation Transplant

Published on Apr 3
· Submitted by peterljq on Apr 8

Abstract

Diffusion models are widely used for image editing tasks. Existing editing methods often design a representation manipulation procedure by curating an edit direction in the text embedding or score space. However, such a procedure faces a key challenge: overestimating the edit strength harms visual consistency while underestimating it fails the editing task. Notably, each source image may require a different editing strength, and it is costly to search for an appropriate strength via trial-and-error. To address this challenge, we propose Concept Lancet (CoLan), a zero-shot plug-and-play framework for principled representation manipulation in diffusion-based image editing. At inference time, we decompose the source input in the latent (text embedding or diffusion score) space as a sparse linear combination of the representations of the collected visual concepts. This allows us to accurately estimate the presence of concepts in each image, which informs the edit. Based on the editing task (replace/add/remove), we perform a customized concept transplant process to impose the corresponding editing direction. To sufficiently model the concept space, we curate a conceptual representation dataset, CoLan-150K, which contains diverse descriptions and scenarios of visual terms and phrases for the latent dictionary. Experiments on multiple diffusion-based image editing baselines show that methods equipped with CoLan achieve state-of-the-art performance in editing effectiveness and consistency preservation.
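To make the decomposition concrete, here is a minimal sketch of sparse coding a source embedding over a dictionary of concept representations, using scikit-learn's lasso-based sparse_encode. The names (decompose, concept_dict) and the toy data are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: sparse decomposition of a source embedding over a
# dictionary of concept representations (illustrative, not the paper's code).
import numpy as np
from sklearn.decomposition import sparse_encode

def decompose(x, concept_dict, alpha=0.05):
    """Estimate sparse coefficients of x over the concept dictionary.

    x            : (d,) source representation (e.g., a text embedding)
    concept_dict : (n_concepts, d) rows are concept representations
    alpha        : sparsity weight for the lasso solver
    """
    codes = sparse_encode(x[None, :], concept_dict,
                          algorithm="lasso_lars", alpha=alpha)
    return codes[0]  # (n_concepts,) mostly-zero coefficients

# Toy usage with random stand-ins for real concept embeddings.
rng = np.random.default_rng(0)
D = rng.normal(size=(150, 768))
D /= np.linalg.norm(D, axis=1, keepdims=True)
x = 0.8 * D[3] + 0.5 * D[42] + 0.05 * rng.normal(size=768)
coeffs = decompose(x, D)
print("strongest concepts:", np.argsort(-np.abs(coeffs))[:3])
```

In this view, the estimated coefficient of the source concept provides a per-image edit strength instead of a globally tuned one.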

Community

[CVPR 2025] Concept Lancet: Image Editing with Compositional Representation Transplant
[Project Page] [ArXiv]
Editing images with diffusion models often requires calibrating the edit strength. Guessing how far to move along the editing direction is unreliable: too much breaks the image, too little misses the edit. How can we impose new visual concepts precisely, without compromising what we want to keep?
Concept Lancet (CoLan) is a zero-shot, plug-and-play framework that enables principled image editing via sparse decomposition in the latent space. By modeling each image as a combination of visual concepts from the curated CoLan-150K dataset, CoLan accurately identifies and transplants concepts to perform edits that are both precise and visually consistent.
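
A corresponding sketch of the transplant step, under the same assumptions as the snippet above: the coefficient estimated for the source concept sets how strongly the target concept is imposed. transplant, source_idx, and target_vec are hypothetical names, and the fallback weight in the "add" case is an assumption rather than the paper's choice.

```python
import numpy as np

def transplant(x, concept_dict, coeffs, task, source_idx=None, target_vec=None):
    """Edit the representation x by manipulating one concept component.

    task = "replace": swap the source concept's contribution for the target
    task = "add"    : inject the target concept, reusing the source weight
                      if available (fallback weight is an assumption)
    task = "remove" : subtract the source concept's estimated contribution
    """
    x_edit = x.copy()
    w = coeffs[source_idx] if source_idx is not None else 0.0
    if task == "replace":
        x_edit += w * (target_vec - concept_dict[source_idx])
    elif task == "add":
        x_edit += (w if w != 0.0 else 1.0) * target_vec
    elif task == "remove":
        x_edit -= w * concept_dict[source_idx]
    return x_edit

# Example: replace the concept at index 3 with another dictionary atom.
# x_edit = transplant(x, D, coeffs, "replace", source_idx=3, target_vec=D[7])
```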

Very promising.

Datasets citing this paper 1

Collections including this paper 5