United States

The centralized planning for simultaneous-move decentralized execution paradigm emerged as the state-of-the-art approach to $\epsilon$-optimally solving a decentralized partially observable Markov decision process. However, scalability remains a significant issue.
This paper presents a novel---more scalable---alternative, namely sequential central planning for simultaneous-move decentralized execution.
This methodology further extends the applicability of Bellman's principle of optimality, yielding three new properties.
First, it allows a central planner to reason upon sufficient sequential-move statistics instead of prior simultaneous-move ones.
Next, it proves that $\epsilon$-optimal value functions are piecewise linear and convex in such sufficient sequential-move statistics.
Finally, it drops the complexity of the backup operators from double exponential to polynomial at the expense of longer planning horizons.  
It also makes single-agent methods easy to apply: for example, the SARSA algorithm, enhanced with these findings, applies while preserving its convergence guarantees.
Experiments on standard $2$- as well as $n$-agent benchmarks from the literature, against state-of-the-art $\epsilon$-optimal simultaneous-move solvers, confirm the superiority of the novel approach.
This paradigm opens the door for efficient planning and reinforcement learning methods for multi-agent systems.
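
As a rough illustration of the piecewise linear and convex property mentioned in the abstract, the sketch below represents a value function as the maximum over a finite set of linear functions of a statistic vector. The names `pwlc_value` and `alpha_vectors` and all numbers are hypothetical and are not taken from the paper; this is only a generic instance of the property, not the authors' algorithm.

```python
from typing import Sequence


def pwlc_value(statistic: Sequence[float],
               alpha_vectors: Sequence[Sequence[float]]) -> float:
    """Evaluate V(s) = max over alpha of <alpha, s>.

    A maximum of finitely many linear functions is piecewise linear
    and convex in s, which is the shape the abstract refers to.
    """
    return max(
        sum(a_i * s_i for a_i, s_i in zip(alpha, statistic))
        for alpha in alpha_vectors
    )


# Hypothetical linear pieces over a two-dimensional statistic.
alpha_vectors = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]]

corner_a = [1.0, 0.0]
corner_b = [0.0, 1.0]
midpoint = [0.5, 0.5]

v_a = pwlc_value(corner_a, alpha_vectors)
v_b = pwlc_value(corner_b, alpha_vectors)
v_mid = pwlc_value(midpoint, alpha_vectors)

# Convexity: the value at the midpoint never exceeds the average
# of the values at the two endpoints.
print(v_a, v_b, v_mid, v_mid <= 0.5 * (v_a + v_b))
```

Storing only the linear pieces is what lets a planner generalise a value estimate from a few evaluated statistics to the whole statistic space.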

AAAI 2025

Optimally Solving Simultaneous-Move Dec-POMDPs: The Sequential Central Planning Approach

reinforcement learning

technical paper

We are pleased to announce the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25), which will be held in Philadelphia, Pennsylvania at the Pennsylvania Convention Center from February 25 to March 4, 2025.

The purpose of the AAAI conference series is to promote research in Artificial Intelligence (AI) and foster scientific exchange between researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. AAAI-25 will feature technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, and exhibit programs, and a range of other activities to be announced.



Coalition formation over graphs is a well-studied class of games whose players are vertices and whose feasible coalitions must be connected subgraphs. In this setting, the existence and computation of equilibria, under various notions of stability, have attracted a lot of attention. However, the natural process by which players, starting from any feasible state, strive to reach an equilibrium through a series of unilateral improving deviations has been less studied. We investigate the convergence of dynamics towards individually stable outcomes from the following perspective: what are the most general classes of preferences and graph topologies that guarantee convergence? To this end, on the one hand, we cover a hierarchy of preferences, ranging from the most general to a subcase of additively separable preferences, including individually rational and monotone cases. On the other hand, given that convergence may fail in graphs admitting a cycle even in our most restrictive preference class, we analyze acyclic graph topologies such as trees, paths, and stars.

Individually Stable Dynamics in Coalition Formation over Graphs

Formulating a real-world problem under the Reinforcement Learning framework involves non-trivial design choices, such as selecting a discount factor for the learning objective (discounted cumulative rewards), which articulates the planning horizon of the agent. This work investigates the impact of the discount factor on the bias-variance trade-off given structural parameters of the underlying Markov Decision Process. Our results support the idea that a shorter planning horizon might be beneficial, especially under partial observability.
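
To make the link between discount factor and planning horizon concrete, here is a minimal sketch using the common rule of thumb that a discount factor `gamma` corresponds to an effective horizon of roughly `1 / (1 - gamma)` steps. The function names and the reward sequence are hypothetical illustrations, not material from the paper.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * rewards[t], accumulated backwards for stability."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def effective_horizon(gamma: float) -> float:
    """Rule-of-thumb horizon 1 / (1 - gamma); requires 0 <= gamma < 1."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    return 1.0 / (1.0 - gamma)


# Hypothetical reward stream: a constant reward of 1 for 50 steps.
rewards = [1.0] * 50

for gamma in (0.5, 0.9, 0.99):
    print(gamma, effective_horizon(gamma), discounted_return(rewards, gamma))
```

A smaller `gamma` down-weights distant rewards, so the agent effectively plans over fewer steps; this is the "shallow planning" regime the abstract discusses.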

On Shallow Planning Under Partial Observability

Personalized federated learning (PFL) has recently gained significant attention for its capability to address the poor convergence performance on highly heterogeneous data and the lack of personalized solutions of traditional federated learning (FL). Existing mainstream approaches either perform personalized aggregation based on a specific model architecture to leverage global knowledge or achieve personalization by exploiting client similarities. However, the former overlooks the discrepancies in client data distributions by indiscriminately aggregating all clients, while the latter lacks fine-grained collaboration of classifiers relevant to local tasks. In view of this challenge, we propose a Personalized Federated learning method for Enhancing Collaboration among Similar Classifiers (PFedCS), which aims at improving each client’s accuracy on local tasks. Concretely, it leverages awareness of the similarities between client classifiers to address the above problems. By iteratively measuring the distance between the classifier parameters of different clients and clustering with each client as a cluster center, the central server adaptively identifies the collaborating clients with similar data distributions. In addition, a distance-constrained aggregation method is designed to generate customized collaborative classifiers to guide local training. Extensive experimental evaluations conducted on three datasets demonstrate that our method achieves state-of-the-art performance.

PFedCS: A Personalized Federated Learning Method for Enhancing Collaboration among Similar Classifiers

This paper introduces Conformal Thresholded Intervals (CTI), a novel conformal regression method that aims to produce the smallest possible prediction set with guaranteed coverage. Unlike existing methods that rely on nested conformal frameworks and full conditional distribution estimation, CTI estimates the conditional probability density for a new response to fall into each interquantile interval using off-the-shelf multi-output quantile regression. By leveraging the inverse relationship between interval length and probability density, CTI constructs prediction sets by thresholding the estimated conditional interquantile intervals based on their length. The optimal threshold is determined using a calibration set to ensure marginal coverage, effectively balancing the trade-off between prediction set size and coverage. CTI's approach is computationally efficient and avoids the complexity of estimating the full conditional distribution. The method is theoretically grounded, with provable guarantees for marginal coverage and achieving the smallest prediction size given by Neyman-Pearson. Extensive experimental results demonstrate that CTI achieves superior performance compared to state-of-the-art conformal regression methods across various datasets, consistently producing smaller prediction sets while maintaining the desired coverage level. The proposed method offers a simple yet effective solution for reliable uncertainty quantification in regression tasks, making it an attractive choice for practitioners seeking accurate and efficient conformal prediction.
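
The thresholding idea described above can be sketched as a split-conformal procedure. The code below is one plausible reading of the abstract, not the authors' implementation: it assumes each input comes with a sorted list of estimated quantiles (hypothetical values here), scores a calibration point by the length of the interquantile interval containing its response, and picks the threshold as the usual conformal quantile of those scores. All function names are illustrative.

```python
import math
from bisect import bisect_right
from typing import List, Sequence, Tuple


def interval_length_score(quantiles: Sequence[float], y: float) -> float:
    """Length of the interquantile interval containing y (inf if outside).

    Assumes `quantiles` is sorted in increasing order (no quantile crossing).
    """
    if y < quantiles[0] or y > quantiles[-1]:
        return math.inf
    k = min(bisect_right(quantiles, y), len(quantiles) - 1)
    return quantiles[k] - quantiles[k - 1]


def calibrate_threshold(cal_quantiles: Sequence[Sequence[float]],
                        cal_y: Sequence[float], alpha: float) -> float:
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    scores = sorted(interval_length_score(q, y)
                    for q, y in zip(cal_quantiles, cal_y))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    return scores[rank - 1] if rank <= n else math.inf


def prediction_set(quantiles: Sequence[float],
                   threshold: float) -> List[Tuple[float, float]]:
    """Keep the intervals short enough (that is, dense enough) to pass."""
    if math.isinf(threshold):
        return [(-math.inf, math.inf)]  # no finite threshold: whole real line
    return [(lo, hi) for lo, hi in zip(quantiles, quantiles[1:])
            if hi - lo <= threshold]


# Hypothetical quantile estimates shared by four calibration points.
q = [0.0, 1.0, 1.5, 4.0]
threshold = calibrate_threshold([q] * 4, [0.5, 1.2, 1.3, 3.0], alpha=0.5)
print(threshold, prediction_set(q, threshold))
```

Because each interquantile interval carries roughly the same probability mass, a short interval indicates high density, so keeping the short intervals first tends to give small prediction sets for a given coverage level.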

Conformal Thresholded Intervals for Efficient Regression

Recent advancements in Large Language Models (LLMs) have led to significant breakthroughs in various natural language processing tasks. However, generating factually consistent responses in knowledge-intensive scenarios remains a challenge due to issues such as hallucination, difficulty in acquiring long-tailed knowledge, and limited memory expansion. This paper introduces SMART, a novel multi-agent framework that leverages external knowledge to enhance the interpretability and factual consistency of LLM-generated responses. SMART comprises four specialized agents, each performing a specific sub-trajectory action to navigate complex knowledge-intensive tasks. We propose a multi-agent co-training paradigm, Long-Short Trajectory Learning, which ensures synergistic collaboration among agents while maintaining fine-grained execution by each agent. Extensive experiments on five knowledge-intensive tasks demonstrate SMART's superior performance compared to widely adopted knowledge internalization and knowledge enhancement methods. Our framework can extend beyond knowledge-intensive tasks to more complex scenarios.

Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks

Spatiotemporal forecasting (STF) is pivotal in urban computing, yet data scarcity in developing cities hampers robust model training. Addressing this, recent studies leverage transfer learning to migrate knowledge from data-rich (source) to data-poor (target) cities. This strategy, while effective, faces challenges as pre-trained models risk absorbing noise and harmful information due to data distribution disparities, potentially undermining the accuracy of forecasts for target cities.
To address this issue, we propose a one-stage STF framework named Target-Skewed Joint Training (TSJT). Central to TSJT is a novel Target-Skewed Backward training strategy that selectively refines gradients from source city data, preserving only the elements that positively impact the target city. 
To further enhance the quality of these gradients, we have designed a Node Prompting Module (NPM). 
TSJT is crafted for seamless integration with existing STF models, endowing them with the capability to efficiently tackle challenges stemming from data scarcity. 
Experimental results on several real-world datasets from multiple cities substantiate the efficacy of TSJT in the realm of cross-city transfer learning.

Drawing Informative Gradients from Sources: A One-stage Transfer Learning Framework for Cross-city Spatiotemporal Forecasting

In Tennenholtz’s program equilibrium, players of a game submit programs to play on their behalf. Each program receives the other programs’ source code and outputs an action. This can model interactions involving AI agents, mutually transparent institutions, or commitments. Tennenholtz (2004) proves a folk theorem for program games, but the equilibria constructed are very brittle. We therefore consider simulation-based programs – i.e., programs that work by running opponents’ programs. These are relatively robust (in particular, two programs that act the same are treated the same) and are more practical than proof-based approaches. Oesterheld’s (2019) $\epsilon$Grounded$\pi$Bot is such an approach. Unfortunately, it is not generally applicable to games of three or more players, and only allows for a limited range of equilibria in two-player games. In this paper, we propose a generalisation of Oesterheld’s (2019) $\epsilon$Grounded$\pi$Bot. We prove a folk theorem for our programs in a setting with access to a shared source of randomness. We then characterise their equilibria in a setting without shared randomness. Both with and without shared randomness, we achieve a much wider range of equilibria than Oesterheld’s (2019) $\epsilon$Grounded$\pi$Bot. Finally, we explore the limits of simulation-based program equilibrium, showing that the Tennenholtz folk theorem cannot be attained by simulation-based programs without access to shared randomness.
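
For readers unfamiliar with simulation-based programs, the following sketch shows the simplest two-player case in a Prisoner's Dilemma, in the spirit of the $\epsilon$-grounded construction attributed to Oesterheld (2019): with probability `eps` the program cooperates outright, and otherwise it simulates the opponent playing against itself and copies the result. This is only an illustration of the earlier two-player idea, not the generalisation proposed in this paper; programs are modelled as Python callables rather than source code, and all names are hypothetical.

```python
import random
from typing import Callable

# A program takes the opponent's program and a random source, returns "C" or "D".
Program = Callable[["Program", random.Random], str]


def eps_grounded_fair_bot(opponent: Program, rng: random.Random,
                          eps: float = 0.1) -> str:
    """Cooperate with probability eps; otherwise copy what the opponent
    would play against this very program (found by running it)."""
    if rng.random() < eps:
        return "C"  # grounding step: ends the chain of mutual simulation
    return opponent(eps_grounded_fair_bot, rng)


def defect_bot(opponent: Program, rng: random.Random) -> str:
    """Always defects, regardless of the opponent."""
    return "D"


rng = random.Random(0)

# Against itself, the simulation chain stops at a grounding step with
# probability 1, and every level then returns "C".
self_play = [eps_grounded_fair_bot(eps_grounded_fair_bot, rng) for _ in range(100)]

# Against a defector, it defects unless the grounding step fires first.
vs_defector = [eps_grounded_fair_bot(defect_bot, rng) for _ in range(1000)]

print(set(self_play), vs_defector.count("C") / len(vs_defector))
```

The random grounding step is what keeps mutual simulation from recursing forever; the expected simulation depth is about `1 / eps`.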

Characterising Simulation-Based Program Equilibria

Despite the huge success of text-to-image (TTI) generation models, existing studies seldom consider whether generated images accurately represent factual information. In this paper, we define image hallucination as the failure of generated images to accurately depict factual information. To address this, we introduce I-HallA (Image Hallucination evaluation with Question Answering), an automatic evaluation metric that measures the factuality of generated images through visual question answering (VQA), and I-HallA v1.0, a curated benchmark dataset. We develop a three-stage pipeline that generates curated question-answer pairs using multiple GPT-4 Omni-based agents with human judgments. Our evaluation protocols measure image hallucination by testing whether images from existing text-to-image models can be used to answer these questions correctly. The I-HallA v1.0 dataset comprises 1.2K diverse image-text pairs across 9 categories with varying levels of difficulty and 1,000 questions covering 9 compositions. We evaluate 5 different text-to-image models using I-HallA and demonstrate that these state-of-the-art models often fail to accurately convey factual information. Additionally, we establish the validity of our evaluation method through human evaluation, yielding a Spearman's correlation of 0.95. We believe our benchmark dataset and metric can serve as a foundation for developing factually accurate text-to-image generation models.

Evaluating Image Hallucination in Text-to-Image Generation with Question-Answering

Vision-based geo-localization technology for UAVs, serving as a secondary source of GPS information in addition to the global navigation satellite systems (GNSS), can still operate independently when communication with the external environment is cut off.
Recent deep learning based methods frame this as a task of image matching and retrieval.
By retrieving, for each drone-view image, matches from a satellite image database tagged with GPS information, approximate localization information can be obtained.
However, due to high costs and privacy concerns, it is usually difficult to obtain large quantities of drone-view images from a continuous area.
Existing drone-view datasets are mostly composed of small-scale aerial photography with a strong assumption that there exists a perfect one-to-one aligned reference image for any query, leaving a significant gap from the practical localization scenario.
In this work, we construct a large-range, continuous-area UAV geo-localization dataset named GTA-UAV, featuring multiple flight altitudes, attitudes, scenes, and targets using modern computer games.
Based on this dataset, we introduce a more practical UAV geo-localization task including partial matches of cross-view paired data, and expand the image-level retrieval to the actual localization in terms of distance (meters).
For the construction of drone-view and satellite-view pairs, we adopt a weight-based contrastive learning approach, which allows for effective learning while avoiding additional post-processing matching steps.
Experiments demonstrate the effectiveness of our data and training method for UAV geo-localization, as well as the generalization capabilities to real-world scenarios.

Game4Loc: A UAV Geo-Localization Benchmark from Game Data

Wearable Human Activity Recognition (WHAR) is a prominent research area within ubiquitous computing. Multi-sensor synchronous measurement has proven more effective for WHAR than using a single sensor. However, existing WHAR methods use shared convolutional kernels for indiscriminate temporal feature extraction across each sensor variable, which fails to effectively capture spatio-temporal relationships of intra-sensor and inter-sensor variables. We propose the **DecomposeWHAR** model consisting of a Modality-Aware Signal Decomposition phase and a Hierarchical Interaction Fusion phase to better model the relationships between variables. The Decomposition phase creates high-dimensional representations of each intra-sensor variable through Modality-Specific Embedding, followed by Depth-Wise Convolution to capture local temporal features while preserving their unique characteristics. In the fusion phase, the intra-sensor relationships are captured by Point-Wise Convolution at the channel level and variable level. Then we integrate features across the entire temporal level to capture long-range dependencies using the Global Temporal Aggregation module with Mamba. Finally, the cross-sensor interaction is implemented by the self-attention mechanism to capture the inter-sensor spatial correlations. Our model demonstrates superior performance on three widely used WHAR datasets, significantly outperforming State-of-the-Art (SOTA) models while maintaining acceptable computational efficiency.
