Lecture image placeholder

Premium content

Access to this content requires a subscription. You must be a premium user to view this content.

Monthly subscription - $9.99Pay per view - $4.99Access through your institutionLogin with Underline account
Need help?
Contact us
Lecture placeholder background

EMNLP 2025

Suzhou, China

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

Alignment of large language models (LLM) is a process that ensures the model’s responses to user prompts align with human intentions and social values. This optimization typically relies on pre-collected prompts. The collection of these prompts often either requires careful human interventions or proves to be difficult to have a good coverage over all scenarios an LLM can improve over . To address this issue, we propose an alignment method based on a two-agent game, consisting of an adversarial agent and a defensive agent. The adversarial agent’s task is to generate prompts that expose the deficiencies of the defensive agent. At the same time, the defensive agent improves its performance on the prompts generated by the adversary based on feedback from the reward model. This iterative process is repeated to enhance the model’s performance. We theoretically demonstrate that, under mild assumptions, this iterative alignment process converges to a Nash equilibrium by both agents. Learning in this competitive environment results in policies with better generalization capabilities. We demonstrate the advantage of our framework using extensive experiments.

Downloads

Paper
access premium content

Next from EMNLP 2025

Bhaasha, Bhāṣā, Zaban: A Survey for Low-Resourced Languages in South Asia – Current Stage and Challenges
poster

Bhaasha, Bhāṣā, Zaban: A Survey for Low-Resourced Languages in South Asia – Current Stage and Challenges

EMNLP 2025

Xiaolei Huang
Xiaolei Huang and 1 other author

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2026 Underline - All rights reserved