
In this paper, we present the first detailed analysis of how optimization hyperparameters, such as learning rate, weight decay, momentum, and batch size, influence robustness against both transfer-based and query-based attacks.
Supported by theory and experiments, our study spans a variety of practical deployment settings, including centralized training, ensemble learning, and distributed training. We uncover a striking dichotomy: for transfer-based attacks, decreasing the learning rate enhances robustness by up to 66%. In contrast, for query-based attacks, increasing the learning rate consistently improves robustness by more than 28% across various settings and data distributions. Leveraging these findings, we explore, for the first time, the optimization hyperparameter design space to jointly enhance robustness against both transfer-based and query-based attacks. Our results reveal that distributed models benefit the most from hyperparameter tuning, achieving a remarkable tradeoff by mitigating both attack types simultaneously and more effectively than other training setups.
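
For readers who want to see where the four hyperparameters named above enter a training loop, the following is a minimal sketch of mini-batch SGD with momentum and L2 weight decay on a toy linear regression problem. It is illustrative only and not the training setup used in the paper: the function names (`sgd_step`, `minibatch_grad`, `train`), the toy loss, and the convention of adding the weight-decay term to the gradient before the momentum update are all assumptions made for this example.

```python
import random


def sgd_step(w, v, grad, lr, momentum, weight_decay):
    """One SGD step with momentum and L2 weight decay.

    Assumed convention (hypothetical, chosen for illustration):
    the decay term is added to the gradient before the momentum update.
    """
    g = [gi + weight_decay * wi for gi, wi in zip(grad, w)]
    v = [momentum * vi + gi for vi, gi in zip(v, g)]
    w = [wi - lr * vi for wi, vi in zip(w, v)]
    return w, v


def minibatch_grad(w, batch):
    """Mean gradient of the toy loss 0.5 * (w.x - y)^2 over a mini-batch."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / len(batch)
    return grad


def train(data, lr=0.1, momentum=0.9, weight_decay=1e-4,
          batch_size=8, steps=200, seed=0):
    """Train a linear model; batch_size sets how many samples are averaged per step."""
    rng = random.Random(seed)
    dim = len(data[0][0])
    w, v = [0.0] * dim, [0.0] * dim
    for _ in range(steps):
        batch = rng.sample(data, batch_size)
        w, v = sgd_step(w, v, minibatch_grad(w, batch), lr, momentum, weight_decay)
    return w
```

Sweeping `lr`, `momentum`, `weight_decay`, and `batch_size` in a loop around `train` is one simple way to enumerate a hyperparameter design space; how each resulting model is scored against transfer-based or query-based attacks is outside the scope of this sketch.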