Drag-Based Image Editing (DBIE), which allows users to manipulate images by directly dragging objects within them, has recently attracted much attention from the community. However, it faces two key challenges: (\emph{i}) point-based drag is often highly ambiguous and difficult to align with user intentions; (\emph{ii}) current DBIE methods primarily rely on alternating between motion supervision and point tracking, which is not only cumbersome but also fails to produce high-quality results. These limitations motivate us to explore DBIE from a new perspective: we unify it as a Latent Region Optimization (LRO) problem, which uses region-level geometric transformations to optimize the latent code and thereby realize drag manipulation. Because the user specifies both the areas and the types of geometric transformations, the ambiguity of point-based drag is effectively addressed. We also propose a simple yet effective editing framework, dubbed \textbf{DragNeXt}. It solves LRO through Progressive Backward Self-Intervention (PBSI), which simplifies the overall procedure relative to the alternating workflow and further enhances quality by fully leveraging region-level structure information and progressive guidance from intermediate drag states. We validate \textbf{DragNeXt} on our NextBench, and extensive experiments demonstrate that the proposed method significantly outperforms existing approaches. Code will be released on~GitHub.
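
The idea of specifying a region together with a transformation type, and of guiding the edit through intermediate drag states, can be illustrated with a toy sketch. This is not the DragNeXt implementation and it does not implement PBSI: it assumes a plain 2D grid standing in for a latent feature map, supports only integer translation, and merely builds the sequence of intermediate targets that a latent optimizer could be supervised against. The names `RegionDrag`, `translate_region`, and `progressive_targets` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple

Grid = List[List[float]]


@dataclass(frozen=True)
class RegionDrag:
    """Hypothetical region-level drag: a mask plus a requested translation."""
    mask: List[List[bool]]   # True where the user-selected region lies
    shift: Tuple[int, int]   # total (rows, cols) translation


def translate_region(grid: Grid, mask: List[List[bool]], dy: int, dx: int) -> Grid:
    """Move masked cells by (dy, dx).

    Vacated cells are set to 0.0 and cells shifted out of bounds are dropped.
    The input grid is left unmodified.
    """
    h, w = len(grid), len(grid[0])
    out = [[0.0 if mask[r][c] else grid[r][c] for c in range(w)] for r in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                nr, nc = r + dy, c + dx
                if 0 <= nr < h and 0 <= nc < w:
                    out[nr][nc] = grid[r][c]
    return out


def progressive_targets(grid: Grid, drag: RegionDrag, steps: int) -> List[Grid]:
    """Return one target grid per intermediate drag state.

    Step k applies the fraction k/steps of the full shift (rounded to whole
    cells), so the last target corresponds to the complete drag.
    """
    targets = []
    for k in range(1, steps + 1):
        dy = round(drag.shift[0] * k / steps)
        dx = round(drag.shift[1] * k / steps)
        targets.append(translate_region(grid, drag.mask, dy, dx))
    return targets
```

In this sketch every intermediate target is computed from the original grid rather than from the previous target, so rounding errors do not accumulate across steps. A real system would operate on diffusion latents and would need other transformation types (for example rotation or scaling) as well as an actual optimization loop, none of which are shown here.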