China

Text anonymization is essential for responsibly developing and deploying AI in high-stakes domains such as healthcare, social services, and law. In this work, we propose a novel methodology for privacy-preserving synthetic text generation that leverages the principles of de-identification and the Hiding In Plain Sight (HIPS) theory. Our approach introduces entity-aware control codes to guide controllable generation using either in-context learning (ICL) or prefix tuning. The ICL variant ensures privacy levels consistent with the underlying de-identification system, while the prefix tuning variant incorporates a custom masking strategy and loss function to support scalable, high-quality generation. Experiments on legal and clinical datasets demonstrate that our method achieves a strong balance between privacy protection and utility, offering a practical and effective solution for synthetic text generation in sensitive domains.

EMNLP 2025

Controlled Generation for Private Synthetic Text

privacy protection

synthetic data generation

poster

## Welcome!
"I am excited to welcome you to this year’s edition of the Conference on Empirical Methods in Natural Language Processing! Importantly, it marks the 30th edition of EMNLP. With over 8,000 submissions, more than 3,000 accepted papers, and thousands of attendees, we have come a long way from that first
workshop, which had 14 accepted papers. As the field looks ahead, Suzhou is the fitting location for celebrating this milestone: rooted in a long literary tradition, yet modern and forward-looking, and home to a large share of the NLP community."

*Message from the General Chair, Dirk Hovy*

[**Link to Conference Handbook**](https://drive.google.com/file/d/1johU5QqVVYO4RfH7QcIORr7qrVBdzdwC/view?usp=sharing)






Celebrate 30 Years of EMNLP! 
EMNLP 2025 will be held in Suzhou, China from November 5th to November 9th, 2025.

Large language models (LLMs) hold promise for therapeutic interventions, yet most existing datasets rely solely on text, overlooking non-verbal emotional cues essential to real-world therapy. To address this, we introduce a multimodal dataset of 1,441 publicly sourced therapy session videos containing both dialogue and non-verbal signals such as facial expressions and vocal tone. Inspired by Hochschild’s concept of emotional labor, we propose a computational formulation of *emotional dissonance* (the mismatch between facial and vocal emotion) and use it to guide emotionally aware prompting. Our experiments show that integrating multimodal cues, especially dissonance, improves the quality of generated interventions. We also find that LLM-based evaluators misalign with expert assessments in this domain, highlighting the need for human-centered evaluation. Data and code will be released to support future research.

Towards AI-Assisted Psychotherapy: Emotion-Guided Generative Interventions

Non-English dialogue datasets are scarce, and models are often trained or evaluated on translations of English-language dialogues, an approach which can introduce artifacts that reduce their naturalness and cultural appropriateness. This work proposes Dialogue Act Script (DAS), a structured framework for encoding, localizing, and generating multilingual dialogues from abstract intent representations. Rather than translating dialogue utterances directly, DAS enables the generation of new dialogues in the target language that are culturally and contextually appropriate. By using structured dialogue act representations, DAS supports flexible localization across languages, mitigating translationese and enabling more fluent, naturalistic conversations. Human evaluations across Italian, German, and Chinese show that DAS-generated dialogues consistently outperform those produced by both machine and human translators on measures of cultural relevance, coherence, and situational appropriateness.

Multilingual Dialogue Generation and Localization with Dialogue Act Scripting

Large language model (LLM) agents have evolved to intelligently process information, make decisions, and interact with users or tools. A key capability is the integration of long-term memory, enabling these agents to draw upon historical interactions and knowledge. However, the growing memory size and need for semantic structuring pose significant challenges. In this work, we propose an autonomous memory augmentation approach, MemInsight, to enhance semantic data representation and retrieval mechanisms. By applying autonomous augmentation to historical interactions, LLM agents are shown to deliver more accurate and contextualized responses. We empirically validate the efficacy of our proposed approach in three task scenarios: conversational recommendation, question answering, and event summarization. On the LLM-REDIAL dataset, MemInsight boosts persuasiveness of recommendations by up to 14%. Moreover, it outperforms a RAG baseline by 34% in recall for LoCoMo retrieval. Our empirical results show the potential of MemInsight to enhance the contextual performance of LLM agents across multiple tasks.

MemInsight: Autonomous Memory Augmentation for LLM Agents

Large Vision-Language Models (LVLMs) have achieved strong performance on vision-language tasks, particularly Visual Question Answering (VQA). While prior work has explored unimodal biases in VQA, the problem of selection bias in Multiple-Choice Question Answering (MCQA)—where models may favor specific option tokens (e.g., "A") or positions—remains underexplored. In this paper, we investigate both the presence and nature of selection bias in LVLMs through fine-grained MCQA benchmarks spanning easy, medium, and hard difficulty levels, defined by the semantic similarity of distractors. We further propose an inference-time logit-level debiasing method that estimates an ensemble bias vector from general and contextual prompts and applies confidence-adaptive corrections to the model’s output. Our method mitigates bias without retraining and is compatible with frozen LVLMs. Extensive experiments across several state-of-the-art models reveal consistent selection biases that intensify with task difficulty, and show that our mitigation approach significantly reduces bias while improving accuracy in challenging settings. This work offers new insights into the limitations of LVLMs in MCQA and presents a practical approach to improve their robustness in fine-grained visual reasoning.

Benchmarking and Mitigating MCQA Selection Bias of Large Vision-Language Models
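The abstract above describes an inference-time, logit-level correction but does not give its formulas. The following is only a minimal sketch of the general idea, under stated assumptions: the bias is estimated as the mean-centred average of option logits over content-free prompts, the "ensemble" is an equal-weight average of the general and contextual estimates, and the correction strength is one minus the model's top probability. All function names and the example logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of option logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_centred(rows):
    """Average option logits across prompts, then centre them to sum to zero."""
    n_options = len(rows[0])
    mean = [sum(row[i] for row in rows) / len(rows) for i in range(n_options)]
    centre = sum(mean) / n_options
    return [m - centre for m in mean]


def ensemble_bias(general_rows, contextual_rows):
    """Assumed ensemble: equal-weight average of the two bias estimates."""
    general = mean_centred(general_rows)
    contextual = mean_centred(contextual_rows)
    return [(g + c) / 2 for g, c in zip(general, contextual)]


def debias(logits, bias):
    """Subtract the bias, scaled down when the model is already confident."""
    strength = 1.0 - max(softmax(logits))  # assumed confidence-adaptive rule
    return [x - strength * b for x, b in zip(logits, bias)]


# Hypothetical logits for options A-D; the model leans towards "A".
bias = ensemble_bias([[2.0, 0.0, 0.0, 0.0]], [[2.0, 0.0, 0.0, 0.0]])
corrected = debias([1.0, 0.9, 0.0, 0.0], bias)
best = max(range(len(corrected)), key=corrected.__getitem__)
print("ABCD"[best])
```

Because the correction only touches output logits, a scheme of this kind needs no retraining and works with a frozen model, which matches the property the abstract claims for the actual method.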

As MT becomes commonplace, understanding how the general public perceives and relies on imperfect MT becomes critical. This paper contributes to the EMNLP 2025 theme of interdisciplinary recontextualization by bringing Human-Computer Interaction (HCI) methods to study these questions. We present a human study conducted in a public museum (n=452), investigating how fluency and adequacy errors impact bilingual and non-bilingual users' reliance on MT during casual use. Our findings reveal that non-bilingual users often over-rely on MT due to a lack of evaluation strategies and alternatives, while experiencing the impact of errors can prompt users to reassess future reliance. This highlights the need for MT evaluation and NLP explanation techniques to promote MT literacy. More broadly, this work illustrates recontextualizing NLP to address its societal implications.

Toward Machine Translation Literacy: How Lay Users Perceive and Rely on Imperfect Translations

Personalized content moderation can protect users from harm while facilitating free expression by tailoring moderation decisions to individual preferences rather than enforcing universal rules. However, content moderation that is fully personalized to individual preferences, no matter what these preferences are, may lead to even the most hazardous types of content being propagated on social media. In this paper, we explore this risk using hate speech as a case study. Certain types of hate speech are illegal in many countries. We show that, while fully personalized hate speech detection models increase overall user welfare (as measured by user-level classification performance), they also make predictions that violate such legal hate speech boundaries, especially when tailored to users who tolerate highly hateful content. To address this problem, we enforce legal boundaries in personalized hate speech detection by overriding predictions from personalized models with those from a boundary classifier. This approach significantly reduces legal violations while minimally affecting overall user welfare. Our findings highlight both the promise and the risks of personalized moderation, and offer a practical solution to balance user preferences with legal and ethical obligations.

Personalization up to a Point: Why Personalized Content Moderation Needs Boundaries, and How We Can Enforce Them
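The override step described in the abstract above can be illustrated with a small sketch. It assumes that the boundary classifier's decision replaces the personalized one only when the boundary classifier flags the content; the abstract does not spell out the exact rule. The `Classifier` type, the function names, and the keyword-based stand-in classifiers are all hypothetical.

```python
from typing import Callable

# A classifier returns True when the text should be treated as hate speech.
Classifier = Callable[[str], bool]


def with_legal_boundary(personalized: Classifier, boundary: Classifier) -> Classifier:
    """Wrap a personalized classifier so a boundary classifier can override it."""

    def moderate(text: str) -> bool:
        if boundary(text):
            # Assumed rule: content flagged as crossing the legal boundary
            # is always moderated, whatever the user's preferences are.
            return True
        return personalized(text)

    return moderate


# Hypothetical stand-ins for trained models.
def tolerant_user(text: str) -> bool:
    return False  # this user tolerates everything


def boundary_model(text: str) -> bool:
    return "ILLEGAL" in text  # placeholder for a real boundary classifier


moderate = with_legal_boundary(tolerant_user, boundary_model)
print(moderate("ordinary post"), moderate("ILLEGAL content"))
```

Keeping the boundary check as a wrapper leaves the personalized model untouched, so personalization is preserved for everything the boundary classifier does not flag.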

The faithful transfer of contextually-embedded meaning remains one of the most persistent challenges in contemporary machine translation (MT) and is particularly evident when dealing with culture-bound terms: expressions or concepts deeply rooted in specific languages or cultures, resisting direct linguistic transfer. Existing computational approaches to explicitating such terms have focused exclusively on in-text solutions, overlooking paratextual apparatus such as footnotes and endnotes systematically employed by professional translators. In this paper, we formalize Genette's (1997) theory of paratexts from literary and translation studies to introduce the novel task of paratextual explicitation for MT. We construct a dataset of 560 expert-aligned paratexts from four English translations of the classical Chinese literary collection _Liaozhai_ and evaluate LLMs in implicit and explicit reasoning modes on both choice and content of explicitation. Experiments using three intrinsic prompting methods and one agentic retrieval method establish the inherent difficulty of this task, with human evaluation showing that LLM-generated paratexts improve audience comprehension 91.7% of the time, but with markedly less effectiveness than translator-authored ones. Our findings demonstrate the potential of paratextual explicitations for cultural mediation and advancing MT beyond surface-level equivalence, with promising extensions to monolingual explanation and personalized adaptation.

Liaozhai through the Looking-Glass: On Paratextual Explicitation of Culture-Bound Terms in Machine Translation

Distinguishing LLM-generated text from human-written text is a key challenge for safe and ethical NLP, particularly in high-stakes settings such as persuasive online discourse. While recent work focuses on detection, real-world use cases also demand interpretable tools to help humans understand and distinguish LLM-generated texts. To this end, we present an analysis framework comparing human- and LLM-generated arguments using two easily-interpretable feature sets: general-purpose linguistic features (e.g., lexical richness, syntactic complexity) and domain-specific features related to argument quality (e.g., logical soundness, engagement strategies). Applied to */r/ChangeMyView* arguments by humans and three LLMs, our method reveals clear patterns: LLM-generated counter-arguments show lower type-token and lemma-token ratios but higher emotional intensity, particularly in anticipation and trust. They more closely resemble textbook-quality arguments: cogent, justified, explicitly respectful toward others, and positive in tone. Moreover, counter-arguments generated by LLMs converge more closely with the original post's style and quality than those written by humans. Finally, we demonstrate that these differences enable a lightweight, interpretable, and highly effective classifier for detecting LLM-generated comments in CMV.

AI Argues Differently: Distinct Argumentative and Linguistic Patterns of LLMs in Persuasive Contexts
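One of the lexical-richness features named in the abstract above, the type-token ratio, is simple to compute: the number of distinct word forms divided by the total number of words. The sketch below assumes a naive lowercase regex tokenizer, which is an illustrative choice and not necessarily the paper's preprocessing; the lemma-token ratio would additionally need a lemmatizer and is not shown.

```python
import re


def type_token_ratio(text: str) -> float:
    """Distinct tokens divided by total tokens; 0.0 for empty input."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Five tokens, four distinct types ("the" repeats).
print(type_token_ratio("The cat and the dog"))
```

A lower ratio means more repeated vocabulary. Since the ratio tends to fall as texts get longer, comparisons are most meaningful between texts of similar length.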

Recently, autonomous agents built on large language models (LLMs) have experienced significant development and are being deployed in real-world applications. Through the usage of tools, these systems can perform actions in the real world. Given the agents' practical applications and ability to execute consequential actions, such autonomous systems can cause more severe damage than a standalone LLM if compromised. While some existing research has explored harmful actions by LLM agents, our study approaches the vulnerability from a different perspective. We introduce a new type of attack that causes malfunctions by misleading the agent into executing repetitive or irrelevant actions. Our experiments reveal that these attacks can induce failure rates exceeding 80% in multiple scenarios. Through attacks on implemented and deployable agents in multi-agent scenarios, we accentuate the realistic risks associated with these vulnerabilities. To mitigate such attacks, we propose self-examination defense methods. Our findings indicate these attacks are more difficult to detect compared to previous overtly harmful attacks, highlighting the substantial risks associated with this vulnerability.

Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification

Audio deepfake detection (ADD) has become increasingly important due to the rapid evolution of generative speech models. However, progress in this field remains uneven across languages, particularly for low-resource languages like Portuguese, which lack high-quality datasets. In this paper, we introduce BRSpeech-DF, the first publicly available ADD dataset for Portuguese, encompassing both Brazilian and European variants. The dataset contains over 459,000 utterances, including a smaller portion of real speech from 62 speakers and a large collection of synthetic samples generated using multiple zero-shot text-to-speech (TTS) models, each conditioned on the original speaker's voice. By providing this resource, our objective is to support the development of robust, multilingual detection systems, thereby advancing equity in speech forensics and security research. BRSpeech-DF addresses a significant gap in annotated data for underrepresented languages, facilitating more inclusive and generalizable advancements in synthetic speech detection.
