keywords:
rag
llm
nlp
differential privacy
dataset
Retrieval-augmented generation (RAG) still forwards raw passages to large language models, so private facts slip through. Prior defenses are either (i) heavyweight—full DP training, which is impractical for today's 70B-parameter models—or (ii) over-zealous—blanket redaction of every named entity, which slashes answer quality.
We introduce VAGUE-Gate, a lightweight, locally differentially private gate deployable in front of any RAG system. A precision pass drops low-utility tokens under a user-specified privacy budget ε, then up to k(ε) high-temperature paraphrase passes further obscure residual cues; the post-processing property of differential privacy preserves the same ε-LDP bound.
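To make the gating idea concrete, here is a minimal sketch of a per-token precision pass built on randomized response. Everything in it is an illustrative assumption, not the paper's actual mechanism: the `[MASK]` replacement token, the keep probability e^ε/(1+e^ε), the `is_low_utility` predicate, and all function names are hypothetical. It only illustrates the general shape of a local-DP token gate; any deterministic paraphrasing applied afterwards is post-processing and cannot weaken the bound.

```python
import math
import random

MASK = "[MASK]"  # illustrative redaction token (assumption, not from the paper)


def keep_probability(epsilon: float) -> float:
    """Randomized-response keep probability: e^eps / (1 + e^eps)."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def precision_pass(tokens, epsilon, is_low_utility, rng=None):
    """Sketch of a local-DP gate: each low-utility token is released
    only with the randomized-response probability, otherwise masked.
    High-utility tokens pass through unchanged."""
    rng = rng or random.Random(0)
    p = keep_probability(epsilon)
    out = []
    for t in tokens:
        if is_low_utility(t) and rng.random() > p:
            out.append(MASK)
        else:
            out.append(t)
    return out
```

A tighter budget (smaller ε) pushes the keep probability toward 1/2, so more low-utility tokens are masked before the passage ever reaches the LLM.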
To measure both privacy and utility, we release BlendPriv (3k blended-sensitivity QA pairs) and two new metrics: a lexical Information-Leakage Score and an LLM-as-Judge score. Across eight pipelines and four SOTA LLMs, VAGUE-Gate at ε = 0.3 lowers lexical leakage by 70% and semantic leakage by 1.8 points (1–5 scale) while retaining 91% of Plain-RAG faithfulness with only a 240 ms latency overhead.
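The abstract does not give the exact definition of the lexical Information-Leakage Score, so the following is only a plausible minimal sketch under an assumed definition: the fraction of sensitive source tokens that survive verbatim in the gated output. The function name and the case-insensitive whitespace tokenization are assumptions for illustration.

```python
def information_leakage_score(sensitive_tokens, output_text):
    """Assumed lexical leakage metric: fraction of sensitive source
    tokens appearing verbatim in the output (0 = none leaked,
    1 = all leaked). Case-insensitive, whitespace-tokenized."""
    out_tokens = set(output_text.lower().split())
    sens = [t.lower() for t in sensitive_tokens]
    if not sens:
        return 0.0
    return sum(t in out_tokens for t in sens) / len(sens)
```

Under this definition, the reported 70% leakage reduction would correspond to the score on gated outputs dropping to roughly 0.3× its Plain-RAG value.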
All code, data, and prompts are publicly released.