
AAAI 2026

January 23, 2026

Singapore, Singapore


Large language models (LLMs) frequently demonstrate reasoning limitations, often conflating content plausibility with logical validity. This can result in biased inferences, where plausible arguments are incorrectly deemed logically valid or vice versa. This paper investigates how to mitigate content biases in reasoning through activation steering, an inference-time intervention technique that modulates model activations. After localizing the layers responsible for formal and material inference through probing, we investigate contrastive activation steering methods using a controlled syllogistic reasoning dataset that covers 24 types of logical argument schemes, designed to disentangle formal validity from content plausibility. An extensive empirical analysis reveals that contrastive steering consistently supports linear control over content biases. However, we observe that a static steering approach is insufficient for achieving improvements across all tested models. We then leverage the ability to control content effects by dynamically determining the value of the steering parameters via fine-grained conditional methods. We find that conditional steering is effective in reducing biases on unresponsive models, achieving up to 15% absolute improvement in formal reasoning accuracy with a newly introduced kNN-based conditional method. Finally, we find that steering for content effects is robust to prompt variations, incurs minimal side effects on multilingual language modeling capabilities, and can partially generalize to out-of-distribution tasks. Practically, this paper demonstrates that activation-level interventions can offer a scalable test-time strategy for enhancing the robustness of LLMs, contributing towards more systematic and unbiased reasoning.
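The two mechanisms the abstract names can be sketched in a few lines. This is an illustrative toy version, not the paper's implementation: contrastive steering typically builds a steering vector as the difference of mean activations between two contrastive prompt sets and adds it (scaled by a coefficient) to a layer's hidden state; a kNN-based conditional variant can instead pick the coefficient per input by looking up nearest neighbours in a small calibration bank. All function and variable names here (`steering_vector`, `apply_steering`, `knn_coefficient`, the calibration bank) are hypothetical.

```python
import numpy as np

def steering_vector(pos_acts: np.ndarray, neg_acts: np.ndarray) -> np.ndarray:
    """Contrastive steering vector: difference of mean activations between
    a 'formal/valid' prompt set and a 'plausibility-biased' prompt set.
    Each input is (n_examples, hidden_dim)."""
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def apply_steering(hidden: np.ndarray, v: np.ndarray, alpha: float) -> np.ndarray:
    """Static steering: shift a layer's hidden state along v at inference time."""
    return hidden + alpha * v

def knn_coefficient(hidden: np.ndarray,
                    bank_acts: np.ndarray,
                    bank_alphas: np.ndarray,
                    k: int = 3) -> float:
    """Conditional steering (toy kNN variant): choose alpha per input by
    averaging the coefficients of the k most cosine-similar activations
    in a labelled calibration bank."""
    sims = bank_acts @ hidden / (
        np.linalg.norm(bank_acts, axis=1) * np.linalg.norm(hidden) + 1e-8
    )
    nearest = np.argsort(-sims)[:k]
    return float(bank_alphas[nearest].mean())
```

In practice the hidden states would come from a hook on one of the probed layers of the LLM; the dynamic coefficient is what lets steering strength vary with the input instead of being fixed globally.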


