
Premium content
Access to this content requires a subscription. You must be a premium user to view this content.

poster
Ask Again, Then Fail: Large Language Models’ Vacillations in Judgment
keywords:
judgment consistency
llm
alignment
We observe that current large language models often waver in their judgments when faced with follow-up questions, even if the original judgment was correct. This wavering presents a significant challenge for generating reliable responses and building user trust. To comprehensively assess this issue, we introduce a \textsc{Follow-up Questioning Mechanism} along with two metrics to quantify this inconsistency, confirming its widespread presence in current large language models. Furthermore, to mitigate this issue, we explore various prompting strategies for closed-source models, and develop a training-based framework \textsc{Unwavering-FQ} that teaches large language models to maintain their originally correct judgments through synthesized high-quality preference data. Our experimental results confirm the effectiveness of our framework and its ability to enhance the general capabilities of large language models.