AAAI 2026

January 25, 2026

Singapore, Singapore

We study the problem of learning a policy network to optimize several related objectives simultaneously in reinforcement learning (RL). Given $n$ objectives, we seek a small set of $k$ policies, with $k \ll n$, that together cover all the objectives. This problem has broad applications in robotic control and language models. Learning a single policy for all the objectives does not scale when the number of objectives becomes very large. Instead, this work introduces a two-stage procedure of meta-training followed by adaptation. The procedure first trains a meta policy on all the objectives, then quickly adapts this meta policy to multiple randomly chosen subsets of objectives. The adaptation is enabled by a gradient-based approximation property of actor-critic agents, which we have empirically verified to hold within 2% error across a range of RL environments. The overall procedure, named PolicyGradEx, quickly estimates a task affinity score between every pair of objectives from the estimated scores of the sampled subsets. Based on these affinity scores, we apply a grouping procedure to cluster similar objectives into $k$ groups. Extensive experiments on three classic control benchmarks and the Meta-World benchmark demonstrate that our method outperforms state-of-the-art baselines by 16%, while being up to $26\times$ faster than full training. Ablation studies validate the design of each component of our method. For example, our method outperforms both random grouping and gradient-similarity-based grouping by 19%.
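The last two steps of the pipeline described above (turning subset scores into pairwise affinities, then grouping objectives into $k$ clusters) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the rule of averaging the scores of all sampled subsets that contain a pair, the greedy merge used for grouping, and `toy_score`, which stands in for the score that would actually come from adapting the meta policy to a subset.

```python
from itertools import combinations

import numpy as np


def estimate_affinity(n, subsets, scores):
    """Assumed rule: affinity[i, j] is the mean score of the sampled
    subsets that contain both objective i and objective j."""
    total = np.zeros((n, n))
    count = np.zeros((n, n))
    for subset, score in zip(subsets, scores):
        idx = np.array(sorted(subset))
        total[np.ix_(idx, idx)] += score
        count[np.ix_(idx, idx)] += 1
    # Pairs never sampled together keep affinity 0 (an arbitrary default).
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)


def group_objectives(affinity, k):
    """Assumed grouping: greedily merge the two groups with the highest
    mean cross-affinity until only k groups remain."""
    groups = [[i] for i in range(affinity.shape[0])]
    while len(groups) > k:
        best, best_pair = -np.inf, None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                s = affinity[np.ix_(groups[a], groups[b])].mean()
                if s > best:
                    best, best_pair = s, (a, b)
        a, b = best_pair
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return [sorted(g) for g in groups]


def toy_score(subset):
    """Hypothetical stand-in for a subset's adapted-policy score:
    the fraction of pairs in the subset that share a hidden family
    (objectives 0-2 form one family, 3-5 the other)."""
    pairs = list(combinations(subset, 2))
    return sum(i // 3 == j // 3 for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    n, k = 6, 2
    rng = np.random.default_rng(0)
    # Randomly chosen subsets of objectives, as in the adaptation stage.
    subsets = [tuple(rng.choice(n, size=3, replace=False)) for _ in range(200)]
    scores = [toy_score(s) for s in subsets]
    affinity = estimate_affinity(n, subsets, scores)
    print(group_objectives(affinity, k))
```

In this toy setting the expensive part, scoring each subset, is replaced by a closed-form function; in the procedure the abstract describes, those scores would instead be estimated quickly from the meta policy, which is where the reported speedup over full training comes from.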
