Singapore

Olympiad-level benchmarks in mathematics and physics are crucial testbeds for advanced AI reasoning, but chemistry, with its unique multimodal symbolic language, has remained an open challenge. We introduce ChemO, a new benchmark built from the International Chemistry Olympiad (IChO) 2025. ChemO features two key innovations for automated assessment: Assessment-Equivalent Reformulation (AER), which converts problems requiring visual outputs (e.g., drawing molecules) into computationally tractable formats, and Structured Visual Enhancement (SVE), a diagnostic mechanism to disentangle a model’s visual perception capabilities from its core chemical reasoning. To tackle this benchmark, we propose ChemLabs, a hierarchical multi-agent framework that mimics human expert collaboration through specialized agents for problem decomposition, perception, reasoning, and auditing. Experiments on state-of-the-art multimodal models demonstrate that combining SVE with our multi-agent system yields dramatic performance gains. Our top configuration achieves a score of 93.6 out of 100, surpassing an estimated human gold medal threshold and establishing a new state-of-the-art in automated chemical problem-solving.

AAAI 2026

ChemLabs on ChemO: A Multi-Agent System for Multimodal Reasoning on IChO 2025

workshop paper

We are pleased to announce the Fortieth AAAI Conference on Artificial Intelligence (AAAI-26), which will be held at Singapore EXPO from January 20 to January 27, 2026.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-26 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, exhibit programs, and a range of other activities to be announced.


Consider a system of multiple physical agents tasked with collaboratively collecting a set of spatially distributed goals while avoiding collisions with the environment and with each other. 
This type of problem, which combines Multi-Agent Path Finding (MAPF) with task allocation, is known as Multi-Agent Combinatorial Path Finding (MCPF). 
Conflict-Based Steiner Search (CBSS) is an optimal algorithm for MCPF, which assumes that each agent has a fixed goal destination. It selects allocations that yield a solution minimizing the sum of costs (SOC), which we denote as CBSS$_{SOC}$. 
However, this objective is problematic in domains such as search and rescue, where timely service of all goals is more critical than minimizing SOC. 
We therefore propose CBSS$_{SST}$, which minimizes the Sum of Service Times (SST) across all goals using a novel mixed-integer linear programming allocation, thereby generalizing MCPF to settings without requiring fixed goal destinations.
Since CBSS assumes perfect execution, we extend it with robust planning to handle stochastic execution delays. 
We propose two variants of CBSS$_{SST}$: Robust CBSS$_{SST}$ with Strict Verifier, which guarantees the desired robustness, and Robust CBSS$_{SST}$ with Anytime Verifier, which addresses planning-time constraints by returning the most robust solution verified within the available time. 
Our experiments on MCPF benchmarks show that Robust CBSS with the Anytime Verifier solves substantially more instances within the time limit than Robust CBSS with the Strict Verifier, while reducing replanning effort and preserving robustness.
These results demonstrate that Robust CBSS with the Anytime Verifier provides an effective and practical approach to MCPF under uncertainty.
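To make the difference between the two objectives concrete, here is a minimal sketch. It assumes that an agent's cost is the time at which it reaches the last goal on its route (an idle agent costs 0) and that a goal's service time is the time at which its assigned agent reaches it. The function names, the route format, and both allocations are hypothetical illustrations, not taken from the paper.

```python
def sum_of_costs(routes):
    """Sum over agents of the time at which each agent finishes its route."""
    return sum(visits[-1][1] for visits in routes.values() if visits)


def sum_of_service_times(routes):
    """Sum over goals of the time at which each goal is served."""
    return sum(time for visits in routes.values() for _, time in visits)


# Hypothetical allocations: each agent maps to a list of (goal, arrival_time).
# Plan A sends one agent to every goal; plan B splits the goals between agents.
plan_a = {"a1": [("g1", 2), ("g2", 4), ("g3", 6)], "a2": []}
plan_b = {"a1": [("g1", 2), ("g2", 4)], "a2": [("g3", 5)]}

soc_a, soc_b = sum_of_costs(plan_a), sum_of_costs(plan_b)
sst_a, sst_b = sum_of_service_times(plan_a), sum_of_service_times(plan_b)
```

In this toy instance, SOC favors plan A (6 versus 9) while SST favors plan B (11 versus 12), which is the kind of disagreement that motivates a service-time objective when goals should be reached early.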

Balancing Robustness and Efficiency in Multi-Agent Combinatorial Path Finding with Sum of Service Time

Grid-based path planning is a classic problem in AI, widely applied in robotics, computer games, and scheduling. Two-oracle path planning (Topping) is a state-of-the-art fast path-finding method for grid maps. Topping iteratively uses SRC and JPS oracles to determine the first moves and the number of steps, respectively. This enables faster search than SRC, yet incurs high storage and search costs due to inadequate compression. In this paper, we aim to leverage heuristic information as much as possible to enhance the compression performance of Topping and to further improve its search efficiency (i.e., the first-move decision cost), which in turn improves Topping’s overall search performance. Experiments on five benchmarks (478 maps in total) show that our methods reduce the first-move decision cost by about 60% on average (71% at most) and achieve a maximum runtime speedup of 48%. They also improve compression performance and reduce storage costs.

Two-Oracle Path Planning Consolidated by Heuristic-Rich Information

Coordinating a team of robots to reposition multiple objects in cluttered environments requires reasoning jointly about where robots should establish contact, how to manipulate objects once contact is made, and how to navigate safely and efficiently at scale. Prior approaches typically fall into two extremes–either learning the entire task or relying on privileged information and hand-designed planners–both of which struggle to handle diverse objects in long-horizon tasks. To address these challenges, we present a unified framework for collaborative multi-robot, multi-object non-prehensile manipulation that integrates flow-matching co-generation with anonymous multi-robot motion planning. Within this framework, a generative model co-generates contact formations and manipulation trajectories from visual observations, while a novel motion planner conveys robots at scale. Crucially, the same planner also supports coordination at the object level, assigning manipulated objects to larger target structures and thereby unifying robot- and object-level reasoning within a single algorithmic framework. Experiments in challenging simulated environments demonstrate that our approach outperforms baselines in both motion planning and manipulation tasks, highlighting the benefits of generative co-design and integrated planning for scaling collaborative manipulation to complex multi-agent, multi-object settings. Visit gco-paper.github.io for code and demonstrations.

Collaborative Multi-Robot Non-Prehensile Manipulation via Flow-Matching Co-Generation

Multi-agent pathfinding (MAPF) is the problem of finding collision-free paths for a set of agents in a shared environment, typically represented as a graph. One of the approaches to solving MAPF represents the problem as a Boolean satisfiability problem. However, due to the encoding required to represent the valid paths, this method can produce extremely large Boolean formulas, both in terms of variables and clauses. Herein, we propose two encodings of MAPF designed for SAT Modulo Theories solvers. Our approach delegates all the valid path reasoning to a monotonic theory supporting source-target connectivity. This is then combined with a 2-SAT Boolean formula to prevent collisions between agents. Together, these components create an effective separation of concerns: the SAT solver focuses on resolving conflicts, while the theory solver handles the connectivity constraints. Our experiments are conducted in both makespan and sum of costs optimisation settings, empirically demonstrating a notable reduction in both the size of the MAPF encoding and the time required to generate it. In addition, when fixing the SAT solver across experiments, results demonstrate considerable performance improvements when transitioning from pure SAT to our proposed SMT encodings.
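As a reference point for what "collision-free" means, the following sketch checks a set of paths for the two conflict types commonly used in MAPF: vertex conflicts (two agents at the same vertex at the same time step) and edge (swap) conflicts (two agents traversing the same edge in opposite directions). The function name and path format are hypothetical, agents are assumed to wait at their final vertex, and the paper's encodings may define collisions differently.

```python
def find_conflicts(paths):
    """Return vertex and edge (swap) conflicts among per-agent vertex paths.

    paths: list of non-empty lists; paths[i][t] is agent i's vertex at step t.
    Agents are assumed to stay at their last vertex after their path ends.
    """
    horizon = max(len(path) for path in paths)

    def at(path, t):
        # Position at step t, clamped to the final vertex.
        return path[min(t, len(path) - 1)]

    conflicts = []
    for t in range(horizon):
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                if at(paths[i], t) == at(paths[j], t):
                    conflicts.append(("vertex", i, j, t))
                elif (t + 1 < horizon
                      and at(paths[i], t) == at(paths[j], t + 1)
                      and at(paths[i], t + 1) == at(paths[j], t)):
                    conflicts.append(("edge", i, j, t))
    return conflicts


swap = find_conflicts([["A", "B"], ["B", "A"]])
meet = find_conflicts([["A", "B", "C"], ["D", "B"]])
safe = find_conflicts([["A", "B"], ["C", "D"]])
```

A checker like this only validates a candidate solution. In the approach described above, the corresponding no-collision conditions are expressed as Boolean constraints and handed to the solver.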

Modelling Multi-Agent Pathfinding Problems by Integrating Connectivity and No-Collision Constraints

Multi-agent pathfinding (MAPF) is a widely used abstraction for multi-robot trajectory planning problems, where multiple homogeneous agents move simultaneously within a shared environment. Although solving MAPF optimally is NP-hard, scalable and efficient solvers are critical for real-world applications such as logistics and search-and-rescue. To this end, the research community has proposed various decentralized suboptimal MAPF solvers that leverage machine learning. Such methods frame MAPF, from a single agent's perspective, as a Dec-POMDP in which, at each time step, an agent must choose an action based on its local observation, and they typically solve the problem via reinforcement learning or imitation learning. We follow the same approach but additionally introduce a learnable communication module tailored to increase the level of cooperation between the agents via efficient feature sharing. We present Local Communication for Multi-agent Pathfinding (LC-MAPF), a foundation model that applies multi-round communication between neighboring agents to exchange information and improve their coordination. Our experiments show that the introduced method outperforms existing learning-based MAPF solvers, including IL- and RL-based approaches, across multiple metrics in a diverse range of (unseen) test scenarios. Remarkably, the introduced communication mechanism does not compromise the scalability of LC-MAPF, which is a common bottleneck for communication-based MAPF solvers.

Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding

One important component of large industrial electrical projects is the placement of cables that are routed between various pieces of equipment. Many large contractors are still determining routes manually through informal walkthroughs and visual assessment of possible paths. Thus, there is significant room for automation and improvement in this process. This paper describes the general problem, relating it to established problems such as the Multi-Agent Pathfinding Problem (MAPF). We cover some of the practical complications of the real-world problem, describe several abstractions of the problem, and then discuss possible algorithms and approaches for solutions that could be deployed in the real world.

Optimization of Cable Routing During Construction

Decentralized collision avoidance remains a core challenge for scalable multi-robot systems. One of the promising approaches to tackle this problem is Model Predictive Path Integral (MPPI) -- a framework that is naturally suited to handle any robot motion model and provides strong theoretical guarantees. Still, in practice an MPPI-based controller may produce suboptimal trajectories, as its performance relies heavily on uninformed random sampling. In this work, we introduce CoRL-MPPI, a novel fusion of Cooperative Reinforcement Learning and MPPI to address this limitation. We train an action policy (approximated by a deep neural network) in simulation that learns local cooperative collision avoidance behaviors. This learned policy is then embedded into the MPPI framework to guide its sampling distribution, biasing it towards more intelligent and cooperative actions. Notably, CoRL-MPPI preserves all the theoretical guarantees of regular MPPI. We evaluate our approach in dense, dynamic simulation environments against state-of-the-art baselines, including ORCA, BVC, and a multi-agent MPPI implementation. Our results demonstrate that CoRL-MPPI significantly improves navigation efficiency (measured by success rate and makespan) and safety, enabling agile and robust multi-robot navigation.
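For readers unfamiliar with MPPI, the sketch below shows the core sampling-and-weighting step in a simplified form: candidate controls are drawn around a mean, each is scored by a rollout cost, and the update is a softmax-weighted average that favors low-cost samples. Everything here is an illustrative assumption rather than the authors' implementation: the function names, the `temperature` parameter, and the idea of passing a policy's suggested action as the sampling mean. It also omits the importance-sampling correction terms that a full MPPI derivation includes.

```python
import math
import random


def sample_controls(mean, sigma, count, rng):
    """Draw `count` Gaussian perturbations around `mean` (a list of floats)."""
    return [[m + rng.gauss(0.0, sigma) for m in mean] for _ in range(count)]


def mppi_weights(costs, temperature=1.0):
    """Softmax weights over rollout costs; lower cost gives higher weight."""
    best = min(costs)  # subtract the minimum for numerical stability
    unnormalized = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def mppi_update(samples, costs, temperature=1.0):
    """Cost-weighted average of the sampled controls."""
    weights = mppi_weights(costs, temperature)
    dims = len(samples[0])
    return [sum(w * s[d] for w, s in zip(weights, samples)) for d in range(dims)]


# Hypothetical usage: bias sampling towards an action suggested by a policy,
# then score samples by squared distance to a made-up preferred control.
rng = random.Random(0)
policy_mean = [0.5, 0.0]  # stand-in for a learned policy's output
samples = sample_controls(policy_mean, sigma=0.2, count=64, rng=rng)
costs = [(s[0] - 1.0) ** 2 + s[1] ** 2 for s in samples]
control = mppi_update(samples, costs, temperature=0.1)
```

With uninformed sampling the mean would typically be the previous control sequence. Replacing it with a learned suggestion concentrates the samples where good actions are more likely to be found.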

CoRL-MPPI: Enhancing MPPI With Learnable Behaviours For Efficient And Provably-Safe Multi-Robot Collision Avoidance

Multi-agent reinforcement learning (MARL) is a powerful paradigm for solving cooperative and competitive decision-making problems. While many MARL benchmarks have been proposed, few combine continuous state and action spaces with challenging coordination and planning tasks. We introduce CAMAR, a new MARL benchmark designed explicitly for multi-agent pathfinding in environments with continuous actions. CAMAR supports cooperative and competitive interactions between agents and runs efficiently at up to 100,000 environment steps per second. We also propose a three-tier evaluation protocol to better track algorithmic progress and enable deeper analysis of performance. In addition, CAMAR allows the integration of classical planning methods such as RRT and RRT* into MARL pipelines. We use them as standalone baselines and combine RRT* with popular MARL algorithms to create hybrid approaches. We provide a suite of test scenarios and benchmarking tools to ensure reproducibility and fair comparison. Experiments show that CAMAR presents a challenging and realistic testbed for the MARL community.

CAMAR: Continuous Actions Multi-Agent Routing

Multi-Agent Motion Planning (MAMP) is the problem of finding a set of collision-free trajectories, one for each agent, that move the agents from their start configurations to their goal configurations while minimizing the sum of travel times. To solve MAMP, we propose Dual Conflict-Based Search (Dual-CBS), which combines the search and sampling approaches in a hierarchical manner. Dual-CBS first decouples the configuration space into grids and finds a set of grid-based paths. Then, it samples one trajectory for each agent, which is a sequence of configurations within its grid-based path. In comparison to approaches that find collision-free trajectories on a shared roadmap, Dual-CBS does not introduce heavy computational overhead in constructing such a roadmap. Meanwhile, it maintains the mobility of agents for efficiently resolving collisions. In comparison to the state-of-the-art MAMP approach, Simultaneous Sampling-and-Search (SSSP), which heavily relies on local collision avoidance, Dual-CBS guides the sampled trajectories with the grid-based paths and thus finds solutions with a lower sum of travel times.

Dual-CBS: A Hierarchical Approach via Conflict-Based Search and Sampling for Multi-Agent Motion Planning

Recently, multi-robot systems have gained significant attention for their promise of scalable efficiency, reliability, and cost savings. A crucial capability is collaborative transportation, where a team of robots works together to transport a payload. However, key challenges remain, such as potential conflicts between team-level decisions and individual-level robot controls, team kinematic constraints imposed by the robot-payload coupling, and diverse obstacles encountered in 3D terrain. We present Collaborative Quadruped Transportation with Constrained Diffusion (CQTD), enabling a team of closely coupled quadruped robots to collaboratively transport a payload across 3D terrain. A diffusion-based upper level learns terrain-aware team-level trajectories satisfying team kinematic constraints due to the payload coupling, while a lower level optimizes velocity controls of individual robots satisfying collision and anisotropic velocity constraints. Experiments in high-fidelity simulations and on real-world quadruped robot teams demonstrate that CQTD outperforms baseline methods in challenging 3D terrain scenarios requiring closely-coupled collaboration between the quadruped robots.
