AAAI 2026 Main Conference

January 24, 2026

Singapore, Singapore

As corporate responsibility increasingly incorporates environmental, social, and governance (ESG) criteria, ESG reporting is becoming a legal obligation in many regions. These reports serve as a primary mechanism for organizations to document sustainability practices and for stakeholders to evaluate long-term viability and ethical performance. Ensuring regulatory compliance demands disclosures that are accurate, transparent, and verifiable. However, the complexity and scale of ESG disclosures present challenges for interpretation and automated analysis. To facilitate scalable and trustworthy analysis of these reports, this paper introduces ESG-Bench, a novel benchmark dataset aimed at advancing research in ESG report understanding and hallucination mitigation for large language models (LLMs). ESG-Bench consists of human-annotated question–answer (QA) pairs grounded in real-world ESG report contexts, along with fine-grained labels indicating whether model responses are factually supported or hallucinated. By framing ESG report analysis as a QA task with verifiability constraints, ESG-Bench enables systematic evaluation of LLMs' ability to extract and reason over ESG content. We also uncover a previously unexplored use case: applying ESG-Bench to mitigate hallucinations in socially sensitive and compliance-critical contexts. To this end, we design task-specific Chain-of-Thought (CoT) prompting strategies and fine-tune multiple state-of-the-art LLMs on ESG-Bench using CoT-annotated rationales. Experimental results demonstrate that these CoT-based strategies substantially outperform standard prompting and direct fine-tuning, effectively mitigating hallucinations across benchmarks and highlighting the unique challenges of long-context document reasoning in the ESG setting. We also evaluate our approach across existing QA benchmarks to assess generalization beyond the ESG domain.
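
To make the evaluation setup concrete, the following is a minimal sketch of how QA records with supported/hallucinated labels could be scored. It is not the ESG-Bench schema or the authors' evaluation code: the `AnnotatedAnswer` class, its field names, the label strings, and the `hallucination_rate` function are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class AnnotatedAnswer:
    """One hypothetical record; the field names are illustrative assumptions."""
    context: str       # passage taken from an ESG report
    question: str      # question asked about that passage
    model_answer: str  # answer produced by the model under evaluation
    label: str         # assumed label set: "supported" or "hallucinated"


def hallucination_rate(records: Iterable[AnnotatedAnswer]) -> float:
    """Return the fraction of answers whose label is "hallucinated"."""
    items = list(records)
    if not items:
        raise ValueError("no records to score")
    flagged = sum(1 for r in items if r.label == "hallucinated")
    return flagged / len(items)
```

Under these assumptions, comparing this rate for the same questions under standard prompting and under CoT prompting would give one simple view of how much a strategy reduces hallucinations.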
