Large Language Models have achieved significant advances across a wide range of natural language processing tasks. However, they are susceptible to generating hallucinations (fabricated or inaccurate statements presented as factual information), which can undermine their reliability in high-stakes applications. To address this issue, we propose Regularized Contrastive Decoding (RCD), a new inference-stage hallucination mitigation method that exploits hard negative samples to improve the robustness of contrastive decoding. Additionally, we design a new adversarial-aware regularization term to fine-tune hallucination models, guiding them with adversarial perturbations to learn more challenging and diverse hallucination patterns from the available data. This enhances the contrastive decoding process, enabling more effective identification and filtering of erroneous content. We conduct experiments on four public hallucination benchmarks. The results show that our method consistently achieves better hallucination mitigation, demonstrating the effectiveness and superiority of RCD.
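The abstract does not spell out the decoding rule, but RCD builds on standard contrastive decoding, in which tokens favored by a hallucination-prone model are penalized relative to the base model. The sketch below illustrates that baseline step only (not RCD itself); the function names, the toy logits, and the penalty weight `alpha` are illustrative assumptions, not details from the paper.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def contrastive_decode_step(base_logits, halluc_logits, alpha=1.0):
    """One step of plain contrastive decoding (illustrative, not RCD):
    score each token by log p_base - alpha * log p_halluc, so tokens
    that the hallucination-prone model also rates highly are demoted.
    Returns the index of the highest-scoring token."""
    p_base = softmax(base_logits)
    p_hall = softmax(halluc_logits)
    scores = [math.log(pb) - alpha * math.log(ph)
              for pb, ph in zip(p_base, p_hall)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy example: token 2 is likely under BOTH models (a shared, possibly
# hallucinated continuation), while token 0 is favored mainly by the base
# model. Greedy decoding on the base model alone would pick token 2;
# the contrastive score picks token 0 instead.
base_logits = [2.0, 0.5, 2.2]
halluc_logits = [0.1, 0.5, 2.5]
print(contrastive_decode_step(base_logits, halluc_logits))  # → 0
```

The paper's contribution, as described above, is to make the hallucination model a harder negative: the adversarial-aware regularizer fine-tunes it to cover more diverse hallucination patterns, which sharpens the contrast computed in a step like this one.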