Diffusion models have become a leading class of generative models, especially conditional ones that support prompt-driven image synthesis. Recent research emphasizes the pivotal role of noise seeds in improving text-image alignment and producing human-preferred outputs, yet current approaches predominantly rely on random Gaussian noise or heuristic local adjustments and lack a comprehensive global optimization framework. To bridge this gap, we propose Seed Optimization based on Evolution (SOE), a novel hybrid approach that integrates a global search mechanism with a local refinement strategy. The global search couples an evolutionary algorithm with multi-scale random sampling and is guided by a dual-seed evaluation framework that combines CLIP-based text-image alignment scores with ImageReward-based human-preference rewards. The local refinement uses the diffusion inversion process to inject conditional information into noise seeds, encoding prompt semantics into the noise. Extensive experiments across various diffusion models validate the effectiveness and generalizability of SOE in optimizing noise seeds.
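To make the global search idea concrete, the following is a minimal sketch of an evolutionary loop over noise seeds. It is not the authors' implementation. The names `toy_score` and `evolve_seed`, the population sizes, and the mutation scales are all hypothetical, and `toy_score` is a stand-in for the combined CLIP and ImageReward score, which in practice would require running the diffusion model and both scorers. Multi-scale sampling is approximated here by drawing each mutation's standard deviation from a small set of scales.

```python
import random


def toy_score(seed, target):
    # Hypothetical stand-in for a combined CLIP + ImageReward score.
    # Higher is better; the maximum (0.0) is reached when seed == target.
    return -sum((s - t) ** 2 for s, t in zip(seed, target))


def evolve_seed(score_fn, dim=16, pop_size=32, elite=8, generations=40,
                scales=(1.0, 0.3, 0.1), rng=None):
    """Evolutionary search over noise seeds (illustrative sketch only)."""
    rng = rng or random.Random(0)
    # Initial population: i.i.d. standard Gaussian noise seeds.
    population = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        # Selection: rank by score and keep the top `elite` seeds unchanged.
        ranked = sorted(population, key=score_fn, reverse=True)
        parents = ranked[:elite]
        history.append(score_fn(parents[0]))
        # Mutation: perturb a random parent with Gaussian noise whose
        # scale is drawn from `scales` (coarse to fine).
        children = []
        while len(children) < pop_size - elite:
            parent = rng.choice(parents)
            sigma = rng.choice(scales)
            children.append([x + rng.gauss(0.0, sigma) for x in parent])
        population = parents + children
    best = max(population, key=score_fn)
    return best, history


if __name__ == "__main__":
    target_rng = random.Random(1)
    target = [target_rng.gauss(0.0, 1.0) for _ in range(16)]
    best, history = evolve_seed(lambda s: toy_score(s, target))
    print("first-generation best:", history[0])
    print("final best:", toy_score(best, target))
```

Because the elite seeds are carried over unchanged, the best score never decreases between generations. One caveat of this sketch is that repeated mutation lets seeds drift away from a standard Gaussian distribution; a real system would likely need to control for that, since diffusion models expect Gaussian initial noise.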