The advent of one-step text-to-image (T2I) models offers unprecedented synthesis speed. However, their application to text-guided image editing remains severely hampered, because forcing existing training-free editors into a single inference step fails. This failure manifests as severe object distortion and a critical loss of consistency in non-edited regions, both of which result from the high-energy, erratic trajectories produced by naive vector arithmetic on the models' structured fields. To address this problem, we introduce ChordEdit, a model-agnostic, training-free, and inversion-free method that enables high-fidelity one-step editing. We recast editing as a transport problem between the source and target distributions defined by the source and target text prompts. Leveraging dynamic optimal transport theory, we derive a principled, low-energy control strategy. This strategy yields a smoothed, variance-reduced editing field that is inherently stable, so the field can be traversed in a single, large integration step. Theoretically grounded and experimentally validated, ChordEdit delivers fast, lightweight, and precise edits, finally achieving true real-time editing on these challenging models.
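For readers unfamiliar with dynamic optimal transport, the standard Benamou-Brenier formulation is reproduced below as background. It is the textbook problem, not necessarily the exact objective ChordEdit optimizes; the symbols $\rho_t$ (density path), $v_t$ (velocity field), and $\mu_{\mathrm{src}}, \mu_{\mathrm{tgt}}$ (the prompt-defined source and target distributions) are notation assumed here for illustration.

```latex
\[
W_2^2(\mu_{\mathrm{src}}, \mu_{\mathrm{tgt}})
  = \min_{(\rho_t,\, v_t)} \int_0^1 \!\! \int \rho_t(x)\, \lVert v_t(x) \rVert^2 \, dx \, dt
\quad \text{s.t.} \quad
\partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0,
\quad \rho_0 = \mu_{\mathrm{src}},
\quad \rho_1 = \mu_{\mathrm{tgt}}.
\]
```

The integrand is the kinetic energy of the transport, which is the sense in which a "low-energy" control strategy can be read.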
Existing training-free methods fail in one-step settings because the naive drift difference creates a high-energy, erratic trajectory. When integrated in a single step, this leads to significant error accumulation.
Our key contribution is the Chord Control Field, a theoretically grounded, time-weighted average of the source and target drifts. It acts as a temporal smoothing operator, yielding an inherently stable, low-energy field.
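The contrast between a single-time drift difference and a time-weighted average can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the Chord Control Field as derived in the paper: the function names, the uniform weights, the sampled times, and the two toy drifts are all hypothetical.

```python
import numpy as np


def naive_field(v_src, v_tgt, x, t):
    """Drift difference evaluated at one time t (the naive editing field)."""
    return v_tgt(x, t) - v_src(x, t)


def averaged_field(v_src, v_tgt, x, times, weights):
    """Weighted average of drift differences sampled at several times.

    Assumption: the weighting scheme here is illustrative only; the
    paper's actual time weighting is not reproduced.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights form a convex combination
    diffs = np.stack([v_tgt(x, t) - v_src(x, t) for t in times])
    return np.tensordot(w, diffs, axes=1)


def one_step_edit(x, field, step=1.0):
    """Single explicit Euler step along the editing field."""
    return x + step * field


# Hypothetical toy drifts with a fast oscillation in time.
def toy_src(x, t):
    return -x + np.sin(40.0 * t)


def toy_tgt(x, t):
    return (1.0 - x) + np.sin(40.0 * t + 2.0)


x0 = np.zeros(4)
times = np.linspace(0.1, 0.9, 9)
weights = np.ones_like(times)
edited = one_step_edit(x0, averaged_field(toy_src, toy_tgt, x0, times, weights))
```

When the drifts do not depend on time, the averaged field reduces to the naive one, so any difference between the two comes purely from temporal variation in the drifts.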
We compare ChordEdit with state-of-the-art multi-step, few-step, and one-step editors. ChordEdit consistently adheres to the prompt while preserving the background exceptionally well, avoiding the artifacts and identity failures seen in other methods.
To validate the stability of our method, we visualize the energy of the editing fields. The Naive baseline exhibits high-energy spikes, leading to severe artifacts and background corruption. In contrast, ChordEdit derives a stable, low-energy field, resulting in high-fidelity edits that preserve object identity and non-edited regions.
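One way to see why averaging lowers field energy is a small numerical check. The sketch below assumes energy is measured as the squared L2 norm of the field, which may differ from the quantity visualized in the paper; `field_energy`, `energy_gap`, and the random samples are hypothetical. By convexity of the squared norm (Jensen's inequality), the energy of a weighted average never exceeds the weighted average of the individual energies.

```python
import numpy as np


def field_energy(field):
    """Squared L2 norm of a field (assumed energy measure)."""
    return float(np.sum(np.square(field)))


def energy_gap(samples, weights):
    """Return (weighted mean of sample energies, energy of the weighted mean)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # convex weights
    averaged = np.tensordot(w, samples, axes=1)
    mean_sample_energy = float(sum(wi * field_energy(s) for wi, s in zip(w, samples)))
    return mean_sample_energy, field_energy(averaged)


# Hypothetical erratic field samples: a shared signal plus large fluctuations.
rng = np.random.default_rng(0)
samples = np.ones(16) + 2.0 * rng.standard_normal((8, 16))
mean_e, avg_e = energy_gap(samples, np.ones(8))
print(f"mean sample energy: {mean_e:.2f}, energy of averaged field: {avg_e:.2f}")
```

The gap between the two numbers is the weighted variance of the samples around their mean, which is one reading of "variance-reduced" in this setting.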
@misc{lu2026chordedit,
title={ChordEdit: One-Step Low-Energy Transport for Image Editing},
author={Liangsi Lu and Xuhang Chen and Minzhe Guo and Shichu Li and Jingchao Wang and Yang Shi},
year={2026},
eprint={2602.19083},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2602.19083},
}