EMNLP 2025

November 06, 2025

Suzhou, China


Speech Language Models (SLMs) enable natural interaction via spoken instructions and can capture user intent more effectively by detecting nuances in speech. This enhanced functionality, however, introduces new security risks: adversaries can bypass safety mechanisms by injecting adversarial noise into the audio input. In this work, we analyze the vulnerability of open-source SLMs to adversarial attacks and evaluate various defense mechanisms. In our experiments, we use standard projected gradient descent (PGD) in a white-box scenario and find that these models are susceptible to jailbreaks, with attack success rates of 100% in some instances. We propose post hoc defense techniques, including activation patching, that improve robustness to up to 99% with negligible impact on utility. Additionally, we evaluate defenses applied at both the audio-encoder and language-model components, weighing their impact on adversarial resistance and usability.
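A white-box PGD attack of the kind the abstract describes iteratively perturbs the input in the direction that drives the model toward an adversarial target, projecting the perturbation back into an L-infinity ball after each step. The sketch below illustrates this on a toy linear softmax classifier standing in for an SLM component; the model, dimensions, and step sizes are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def pgd_attack(W, x, target, eps=0.1, alpha=0.02, steps=20):
    """White-box PGD on a linear softmax classifier: push input x
    toward the adversarial `target` class while keeping the
    perturbation inside an L-infinity ball of radius eps."""
    delta = np.zeros_like(x)
    onehot = np.eye(W.shape[0])[target]
    for _ in range(steps):
        p = softmax(W @ (x + delta))
        grad = W.T @ (p - onehot)          # d(cross-entropy)/d(input)
        delta -= alpha * np.sign(grad)     # signed step toward the target
        delta = np.clip(delta, -eps, eps)  # project back into the eps-ball
    return x + delta

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 16))   # toy stand-in for a model head (assumption)
x = rng.standard_normal(16)        # stand-in for one audio feature frame
x_adv = pgd_attack(W, x, target=2)
```

In a real attack on an SLM, `W @ x` would be replaced by a full forward pass through the audio encoder and language model, with the gradient obtained by backpropagation to the waveform; the projection step is what keeps the injected noise imperceptible.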

