In cooperative multi-agent reinforcement learning (MARL), subgroup-wise learning assigns sub-tasks to agents in order to enhance team collaboration. However, existing methods rely on manually defined allocation criteria, which hinders prompt adaptation to environmental changes, and they relax communication restrictions, limiting the applicability of these algorithms across domains. To address these issues, we propose the Autonomous Partner Selection (APS) framework, which offers an implicit grouping mechanism that operates autonomously. During training, each agent autonomously selects cooperative partners and integrates its own observations with those of its partners to coordinate cooperative behaviour. To strictly restrict communication, the intention encoder is trained through information distillation, which enables agents to select more cooperative actions based solely on local observations. Meanwhile, to avoid potential conflicts caused by behavioural homogenisation, we apply a contrastive learning strategy to the cooperative intentions generated by agents, ensuring that the behavioural tendencies of different individuals remain as diverse as possible. Finally, we conduct extensive comparative experiments on the StarCraft Multi-Agent Challenge and Google Research Football. The results demonstrate that APS outperforms state-of-the-art algorithms across a range of tasks, and that agents can adapt their grouping strategies to the environment to achieve stronger cooperation.
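As a rough illustration of the diversity objective described above (a minimal sketch, not the paper's implementation: the function names, the cosine-similarity measure, and the exact log-sum-exp loss form are all assumptions), a contrastive-style penalty on agents' cooperative-intention vectors can be written so that it grows as different agents' intentions become more similar:

```python
import math

def cosine(u, v):
    # Cosine similarity between two intention vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def diversity_loss(intentions, temperature=0.5):
    """Toy contrastive diversity penalty over agents' intention vectors.

    For each agent i, every other agent j is treated as a negative, and the
    loss is the mean log-sum-exp of temperature-scaled similarities.
    Minimizing it pushes different agents' intentions apart.
    (Illustrative only; the actual APS objective may differ.)
    """
    n = len(intentions)
    total = 0.0
    for i in range(n):
        negs = [math.exp(cosine(intentions[i], intentions[j]) / temperature)
                for j in range(n) if j != i]
        total += math.log(sum(negs))
    return total / n
```

For two agents with identical intentions the penalty is higher than for two agents with orthogonal intentions, matching the stated goal of keeping behavioural tendencies as diverse as possible.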
