Content not yet available

This lecture has no active video or poster.

AAAI 2026

January 22, 2026

Singapore, Singapore

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

Multimodal large reasoning models (MLRMs) have advanced visual-textual integration, enabling sophisticated human-AI interaction. While prior work has exposed MLRMs to visual jailbreaks, it remains underexplored how their reasoning capabilities reshape the security landscape under adversarial inputs. To fill this gap, we conduct a systematic security assessment of MLRMs and uncover a security-reasoning paradox: although deeper reasoning boosts cross‑modal risk recognition, it also creates cognitive blind spots that adversaries can exploit. We observe that MLRMs oriented toward human-centric service are highly susceptible to users' emotional cues during the deep-thinking stage, often overriding safety protocols or built‑in safety checks under high emotional intensity. Inspired by this key insight, we propose \textbf{EmoAgent}, an autonomous adversarial emotion-agent framework that orchestrates exaggerated affective prompts to hijack reasoning pathways. Even when visual risks are correctly identified, models can still produce harmful completions through emotional misalignment. We further identify persistent high-risk failure modes in transparent deep-thinking scenarios, such as MLRMs generating harmful reasoning masked behind seemingly safe responses. These failures expose misalignments between internal inference and surface-level behavior, eluding existing content-based safeguards. To quantify these risks, we introduce three metrics: (1) \emph{Risk-Reasoning Stealth Score (RRSS)} for harmful reasoning beneath benign outputs; (2) \emph{Risk-Visual Neglect Rate (RVNR)} for unsafe completions despite visual risk recognition; and (3) \emph{Refusal Attitude Inconsistency (RAIC)} for evaluating refusal unstability under prompt variants. Extensive experiments on advanced MLRMs demonstrate the effectiveness of EmoAgent and reveal deeper emotional cognitive misalignments in model safety behavior. \textbf{ Warning: This paper contains examples that may be offensive or harmful.}

Downloads

Paper

Next from AAAI 2026

Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation
poster

Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation

AAAI 2026

Zhiwei Gu and 2 other authors

22 January 2026

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2025 Underline - All rights reserved