Although LLM-based conversational agents demonstrate strong fluency and coherence, they still produce undesirable behaviors (errors) that are challenging to prevent from reaching users during deployment. Recent research leverages large language models (LLMs) to detect errors and guide response-generation models toward improvement. However, current LLMs struggle to identify errors not explicitly specified in their instructions, such as those arising from updates to the response-generation model or shifts in user behavior. In this work, we introduce Automated Error Discovery, a framework for detecting and defining errors in conversational AI, and propose SEEED (Soft-clustering Extended Encoder-Based Error Detection) as an encoder-based approach to its implementation. We enhance the Soft Nearest Neighbor Loss by amplifying distance weighting for negative samples and introduce Label-Based Sample Ranking to select highly contrastive examples for better representation learning. SEEED outperforms adapted baselines, including GPT-4o and Phi-4, across multiple error-annotated dialogue datasets, improving the accuracy of detecting unknown errors by up to 8 points and demonstrating strong generalization to unknown intent detection.
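The abstract describes the loss modification only at a high level, so the following PyTorch sketch is an illustration under assumptions, not the authors' implementation: it starts from the standard Soft Nearest Neighbor Loss over a batch of embeddings and adds a hypothetical neg_scale factor that amplifies squared distances to negative (different-label) samples before the softmax-style weighting; the paper's actual weighting scheme may differ.

```python
import torch

def snn_loss(embeddings, labels, temperature=1.0, neg_scale=2.0):
    """Soft Nearest Neighbor Loss with amplified distance weighting for
    negatives. `neg_scale` is a hypothetical knob for this sketch; the
    paper's exact amplification scheme is not specified in the abstract."""
    # Pairwise squared Euclidean distances between all batch embeddings.
    dists = torch.cdist(embeddings, embeddings, p=2).pow(2)
    n = embeddings.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=embeddings.device)

    same = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-label pairs
    pos_mask = same & ~eye                             # positives, no self
    neg_mask = ~same                                   # negatives

    # Amplify distances to negative samples so they weigh less as
    # "neighbors", sharpening the contrast between classes.
    scaled = torch.where(neg_mask, neg_scale * dists, dists)
    sims = torch.exp(-scaled / temperature)

    num = (sims * pos_mask).sum(dim=1)   # mass on same-label neighbors
    den = (sims * ~eye).sum(dim=1)       # mass on all other samples

    # Skip anchors that have no positive pair in the batch.
    valid = num > 0
    return -torch.log(num[valid] / den[valid]).mean()

# Example: 8 utterance embeddings with 3 error-type labels.
emb = torch.randn(8, 128)
y = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])
loss = snn_loss(emb, y)
```

With neg_scale = 1.0 this reduces to the usual Soft Nearest Neighbor Loss; values above 1 push differently-labeled samples apart more aggressively, which is one plausible reading of "amplifying distance weighting for negative samples".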