EMNLP 2025

November 06, 2025

Suzhou, China


Personality drift in Large Language Models (LLMs) poses a safety and alignment risk: a model that shifts traits across a dialogue can produce inconsistent or harmful behaviours. Yet most existing psychometric evaluations probe LLMs in a context‑free vacuum, answering each item in isolation, which we call the Disney World test. We introduce the first context‑aware framework that transforms conversational history into a rigorous stress test for persona stability. Our method (i) simulates realistic multi‑turn interactions, (ii) proposes new consistency metrics to quantify alignment‑critical trait drift, and (iii) red‑teams models via prompt inconsistency factors. Across 7 frontier and open LLMs, conversational context boosts answer consistency through in‑context learning but also triggers notable personality shifts: GPT‑3.5‑Turbo and GPT‑4‑Turbo show the most extreme deviations. We find that GPT models remain robust to question ordering, whereas Gemini‑1.5‑Flash and Llama‑3.1‑8B are highly order‑sensitive. Our causal analysis suggests GPT responses blend intrinsic persona signals with conversational cues, while Gemini‑1.5‑Flash and Llama‑3.1‑8B rely predominantly on recent context, a potential vulnerability for adversarial steering. We further validate on Role‑Playing Agents, demonstrating that context‑aware alignment yields responses rated more consistent and human‑aligned. Our open‑sourced toolkit enables practitioners to diagnose and monitor persona‑driven safety risks before deployment.
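
The abstract does not define its consistency metrics, so the following is only an illustrative sketch of the two quantities it contrasts: answer consistency across repeated runs and trait drift between a context-free and an in-context condition. The function names, the 1-5 Likert scale, and the sample answers are all assumptions made for this example and are not taken from the paper or its toolkit.

```python
from statistics import mean


def answer_consistency(runs):
    """Fraction of items answered identically in every repeated run.

    `runs` is a list of runs; each run is a list of Likert answers,
    one per questionnaire item, in the same item order.
    """
    n_items = len(runs[0])
    stable = sum(1 for i in range(n_items) if len({run[i] for run in runs}) == 1)
    return stable / n_items


def trait_drift(baseline, in_context):
    """Mean absolute per-item change in Likert score between two conditions."""
    return mean(abs(b - c) for b, c in zip(baseline, in_context))


# Hypothetical 1-5 Likert answers to five items measuring one trait.
context_free_runs = [[4, 2, 5, 3, 4], [4, 3, 5, 3, 4], [4, 2, 5, 2, 4]]
in_context_runs = [[2, 2, 4, 3, 3], [2, 2, 4, 3, 3], [2, 2, 4, 3, 3]]

print(answer_consistency(context_free_runs))  # answers vary across runs
print(answer_consistency(in_context_runs))    # answers are identical across runs
print(trait_drift(context_free_runs[0], in_context_runs[0]))
```

In this made-up data the in-context answers are perfectly repeatable yet sit noticeably away from the context-free baseline, which mirrors the pattern the abstract reports: higher consistency alongside a shift in the measured trait.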


