Negation reasoning remains a challenge for large language models (LLMs), often causing incorrect interpretations of negated statements. In this study, we analyze how various LLMs handle negation and propose two genres of prompts, Warning-based and Persona-based, which improve overall accuracy by up to 3.17% and distractor-negation accuracy by up to 25.14% over the most competitive baselines. Next, we assess the robustness of LLMs by reordering prompts while preserving their meaning, and observe instability linked to positional encoding schemes. Further, we introduce a negative token attention score (NTAS) to quantify how much attention a model directs at negation words. Our comprehensive analysis shows that, within a given LLM family, a model's performance (measured by accuracy) correlates more strongly with NTAS than with model size.
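To make the NTAS idea concrete, below is a minimal sketch of one way to measure attention mass on negation tokens with Hugging Face Transformers. It assumes NTAS is roughly the proportion of attention directed at negation-word positions, averaged over layers, heads, and query positions; the paper's exact definition may differ, and the names (`score_negation_attention`, `NEGATION_WORDS`) are illustrative.

```python
# Sketch only: approximates a negation-token attention score, assuming it is
# the average share of attention mass falling on negation tokens.
import torch
from transformers import AutoModel, AutoTokenizer

NEGATION_WORDS = {"not", "no", "never", "n't", "none", "neither", "nor"}

def score_negation_attention(text: str, model_name: str = "gpt2") -> float:
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_attentions=True)
    model.eval()

    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)

    # Find token positions that correspond to negation words
    # (strip BPE/sentencepiece prefixes before matching).
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    neg_positions = [
        i for i, tok in enumerate(tokens)
        if tok.lstrip("Ġ▁").lower() in NEGATION_WORDS
    ]
    if not neg_positions:
        return 0.0

    # out.attentions: one (batch, heads, query, key) tensor per layer.
    attn = torch.stack(out.attentions)               # (layers, 1, heads, q, k)
    attn_to_neg = attn[..., neg_positions].sum(-1)   # mass on negation keys
    # Average the proportion of attention on negation tokens
    # over layers, heads, and query positions.
    return attn_to_neg.mean().item()

print(score_negation_attention("The model is not able to handle negation."))
```

In this form, a higher score simply means the model's attention heads place more of their probability mass on negation tokens, which is the quantity the abstract relates to downstream accuracy within an LLM family.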