Large Language Models (LLMs) are increasingly employed for literature reviews, academic drafting, and scholarly writing. While their fluency accelerates knowledge synthesis, they frequently produce fabricated or erroneous references, known as citation hallucinations (CHs). Recent studies report hallucination rates ranging from 18% in GPT-4 to over 70% in other frontier models, with domain-specific rates as high as 88% in legal contexts. Benchmarks such as CiteME further highlight the gap between LLMs (4.2–18.5% accuracy) and human annotators (69.7%), while retrieval-augmented systems such as CiteAgent demonstrate partial progress. This study examines methods for automatically detecting hallucinated citations. We present a benchmark of machine-generated references labelled with three fine-grained categories (valid, partially valid, and hallucinated), and propose a hybrid detection pipeline combining bibliographic retrieval, fuzzy similarity, and LLM-based verification. Preliminary experiments indicate improvements over exact-matching baselines. We argue that scalable, real-time citation verification is a crucial step toward trustworthy LLM-based scholarly assistants and reproducible scientific knowledge, and we outline directions for multilingual and domain-specific extensions.
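
The abstract does not specify the implementation, but as a minimal illustrative sketch, such a pipeline might retrieve candidate records from a bibliographic API (Crossref is assumed here), score them against the generated reference with fuzzy string similarity, and defer borderline cases to an LLM-based verifier. All function names and thresholds below are assumptions, not the authors' method.

```python
# Illustrative sketch of one hybrid citation-verification step (not the paper's code).
# Assumes the Crossref REST API for bibliographic retrieval and uses the standard
# library's difflib for fuzzy title matching; thresholds are placeholder values.
import difflib
import requests

CROSSREF_WORKS = "https://api.crossref.org/works"


def fuzzy_score(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two normalised strings."""
    return difflib.SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def verify_citation(title: str, authors: str, year: str) -> str:
    """Label one generated reference as 'valid', 'partially valid', or 'hallucinated'."""
    resp = requests.get(
        CROSSREF_WORKS,
        params={"query.bibliographic": f"{title} {authors} {year}", "rows": 5},
        timeout=10,
    )
    resp.raise_for_status()
    candidates = resp.json()["message"]["items"]

    # Best fuzzy match between the generated title and any retrieved record's title.
    best = max(
        (fuzzy_score(title, (c.get("title") or [""])[0]) for c in candidates),
        default=0.0,
    )
    if best >= 0.9:   # near-exact bibliographic match
        return "valid"
    if best >= 0.6:   # plausible but inexact: the full pipeline would ask an LLM verifier
        return "partially valid"
    return "hallucinated"
```

In a complete system the middle band would be resolved by the LLM-based verification stage (checking authors, venue, and year against the retrieved record) rather than by a fixed threshold alone.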
