EMNLP 2025

November 07, 2025

Suzhou, China

The robustness and security of Large Language Models (LLMs) face increasing threats, especially in multilingual settings. A notable vulnerability is “jailbreaking” by translating harmful queries into rare or underrepresented languages, which often bypasses existing safeguards. In this work, we propose Multilingual Collaborative Defense (MCD), a novel learning method that automatically optimizes a continuous soft safety prompt to facilitate multilingual safeguarding of LLMs. MCD leverages collaborative signals from multiple languages by rotating each language as the training “center,” allowing the auxiliary languages to reinforce safety prompt learning and ensuring cross‑lingual consistency. As a result, MCD improves defense performance across all languages, reduces false refusals, and mitigates safety misalignment caused by corpus imbalance. To evaluate MCD, we construct multilingual versions of jailbreak benchmarks such as MaliciousInstruct and AdvBench, including zero-shot languages, to assess language transferability. Experiments show that MCD outperforms prior approaches in multilingual jailbreak defense while exhibiting strong cross-lingual generalization. Our code is available at https://anonymous.4open.science/r/MCD-0519.
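
The rotation of the training "center" described in the abstract can be pictured with a small scheduling sketch. This is only an illustration under stated assumptions: the function name `rotate_centers`, the language codes, and the round-robin order are hypothetical, and the abstract does not say how the center and auxiliary signals are combined into a loss, so that step is left out.

```python
from itertools import cycle, islice
from typing import Iterator, List, Tuple


def rotate_centers(languages: List[str], steps: int) -> Iterator[Tuple[str, List[str]]]:
    """Yield (center, auxiliaries) pairs, cycling the center role through all languages.

    Hypothetical helper: a round-robin order is assumed here, not taken from the paper.
    """
    if not languages:
        raise ValueError("need at least one language")
    for center in islice(cycle(languages), steps):
        # Every language other than the current center acts as an auxiliary.
        auxiliaries = [lang for lang in languages if lang != center]
        yield center, auxiliaries


# Example language codes are placeholders, not the paper's language set.
schedule = list(rotate_centers(["en", "zh", "sw"], steps=4))
for center, auxiliaries in schedule:
    print(f"center={center} auxiliaries={auxiliaries}")
```

In a training loop, each yielded pair would decide which language's examples drive the soft prompt update at that step and which languages supply the reinforcing signal.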


