In industrial anomaly detection, the scarcity of diverse defective samples poses a major challenge to training robust and scalable models. To address this, we propose an efficient few-shot training framework for synthesizing industrial anomalies using diffusion models. Unlike prior generative methods that rely on redundant or semantically meaningless prompts (e.g., "sks"), our method leverages only normal data with minimal textual guidance. We build upon the Stable Diffusion 3 architecture and introduce lightweight architectural adaptations and a curated training strategy guided by vision-language models (VLMs). Our method generates realistic and diverse anomalies aligned with interpretable prompts such as "scratch" or "broken component", and further allows spatial localization through prompt engineering. During inference, we adopt a multi-prompt strategy with attention modulation to enable precise and controllable anomaly synthesis. Experimental results demonstrate that our synthesized anomalies significantly enhance downstream anomaly detection performance and exhibit strong generalization across various industrial categories, even under limited supervision.
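The abstract does not detail the attention modulation used at inference, but one common form of prompt-conditioned control is cross-attention reweighting: boosting the attention mass that image queries place on the anomaly-prompt tokens (e.g. the tokens of "scratch") before renormalizing. The sketch below is an illustrative toy in numpy, not the paper's implementation; the function name, the token indices, and the `scale` factor are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def modulated_cross_attention(q, k, v, anomaly_token_idx, scale=1.5):
    """Toy cross-attention with reweighting (hypothetical sketch).

    After the usual softmax, the attention weights on the anomaly-prompt
    tokens are multiplied by `scale` and the rows renormalized, steering
    the queries (image patches) toward the anomaly concept.
    """
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (n_query, n_tokens)
    attn[:, anomaly_token_idx] *= scale            # boost anomaly tokens
    attn /= attn.sum(axis=-1, keepdims=True)       # renormalize rows
    return attn @ v

# toy example: 4 image-patch queries attending over 6 prompt tokens, dim 8
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(6, 8))
v = rng.normal(size=(6, 8))
out = modulated_cross_attention(q, k, v, anomaly_token_idx=[2, 3])
print(out.shape)  # (4, 8)
```

In a real diffusion pipeline this reweighting would be applied inside the denoiser's cross-attention layers at each sampling step, with the boosted indices chosen from the tokenized anomaly prompt; the spatial-localization claim in the abstract would further restrict which query positions are modulated.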
