AAAI 2026

January 23, 2026

Singapore, Singapore

Would you like to see your presentation here, made available to a global audience of researchers?
Add your own presentation or have us affordably record your next conference.

In the agricultural domain, the deployment of large language models (LLMs) is hindered by the lack of training data and evaluation benchmarks. To mitigate this issue, we propose AgriEval, the first comprehensive Chinese agricultural benchmark with three main characteristics: (1) \textit{Comprehensive Capability Evaluation.} AgriEval covers six major agriculture categories and 29 subcategories within agriculture, addressing four core cognitive scenarios—memorization, understanding, inference, and generation. (2) \textit{High-Quality Data.} The dataset is curated from university-level examinations and assignments, providing a natural and robust benchmark for assessing the capacity of LLMs to apply knowledge and make expert-like decisions. (3) \textit{Diverse Formats and Extensive Scale.} AgriEval comprises 14,697 multiple-choice questions and 2,167 open-ended question-and-answer questions, establishing it as the most extensive agricultural benchmark available to date. We also present comprehensive experimental results over 51 open-source and commercial LLMs. The experimental results reveal that most existing LLMs struggle to achieve 60\% accuracy, underscoring the developmental potential in agricultural LLMs. Additionally, we conduct extensive experiments to investigate factors influencing model performance and propose strategies for enhancement.

Downloads

Paper

Next from AAAI 2026

PrAda-GAN: A Private Adaptive Generative Adversarial Network with Bayes Network Structure
poster

PrAda-GAN: A Private Adaptive Generative Adversarial Network with Bayes Network Structure

AAAI 2026

+1
Ke Jia and 3 other authors

23 January 2026

Stay up to date with the latest Underline news!

Select topic of interest (you can select more than one)

PRESENTATIONS

  • All Presentations
  • For Librarians
  • Resource Center
  • Free Trial
Underline Science, Inc.
1216 Broadway, 2nd Floor, New York, NY 10001, USA

© 2025 Underline - All rights reserved