
keywords:
mas
multiagent learning
The rapid advancement of multi-agent reinforcement learning (MARL) has given rise to diverse training paradigms for learning the policy of each agent in a multi-agent system. The paradigms of decentralized training and execution (DTDE) and centralized training with decentralized execution (CTDE) have been proposed and widely applied. However, as the number of agents increases, the inherent limitations of these frameworks significantly degrade performance metrics such as win rate and total reward. To reduce the effect of a growing number of agents on these metrics, we propose a novel training paradigm: grouped training decentralized execution (GTDE). This framework eliminates the need for a centralized module and relies solely on local information, effectively meeting the training requirements of large-scale multi-agent systems. Specifically, we first introduce an adaptive grouping module, which assigns each agent to a group based on its observation history. To enable end-to-end training, GTDE uses Gumbel-Sigmoid for efficient point-to-point sampling from the grouping distribution while preserving gradient backpropagation. To handle the variable number of members in a group, two methods are used to implement a group information aggregation module that merges member information within the group. Empirical results show that in a cooperative environment with 495 agents, GTDE increased the total reward by an average of 8,000 compared to the baseline. In a competitive environment with 64 agents, GTDE achieved a 100% win rate against the baseline.
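The Gumbel-Sigmoid sampling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`gumbel_sigmoid`, `sample_group_links`), the temperature `tau`, and the pairwise logit matrix are assumptions made for illustration only. The sketch covers just the forward pass of the standard relaxation, in which logistic noise (the difference of two Gumbel samples) is added to a logit and the result is passed through a temperature-scaled sigmoid. In an automatic-differentiation framework, a straight-through estimator is one common way to keep gradients flowing through the thresholded sample; whether GTDE does this is not stated in the abstract.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample via the inverse CDF."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gumbel_sigmoid(logit, tau, rng):
    """Relaxed Bernoulli sample for one pairwise grouping logit.

    The difference of two Gumbel(0, 1) samples is Logistic(0, 1) noise.
    Lower tau pushes the output toward 0 or 1.
    """
    noise = sample_gumbel(rng) - sample_gumbel(rng)
    return sigmoid((logit + noise) / tau)


def sample_group_links(logits, tau=0.5, seed=0):
    """Sample soft and hard link matrices from pairwise logits.

    logits[i][j] is a hypothetical score for "agent i groups with agent j".
    Returns (soft, hard): soft values lie in [0, 1]; hard values are the
    soft values thresholded at 0.5.
    """
    rng = random.Random(seed)
    soft = [[gumbel_sigmoid(l, tau, rng) for l in row] for row in logits]
    hard = [[1 if s > 0.5 else 0 for s in row] for row in soft]
    return soft, hard


if __name__ == "__main__":
    # Three agents with made-up logits; larger values favor a link.
    logits = [
        [4.0, 2.0, -3.0],
        [2.0, 4.0, -3.0],
        [-3.0, -3.0, 4.0],
    ]
    soft, hard = sample_group_links(logits, tau=0.5, seed=0)
    for row in hard:
        print(row)
```

Each entry is sampled independently, so the cost grows with the number of agent pairs considered rather than requiring a joint categorical sample over all possible groupings. A link with logit `l` is switched on with probability `sigmoid(l)` regardless of `tau`; the temperature only controls how close the soft value sits to 0 or 1.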