Understanding the emergence of collective behaviors in multi-agent systems requires investigating their learning dynamics. However, the theoretical analysis of large-scale graph-structured multi-agent reinforcement learning (MARL) systems remains challenging due to agent heterogeneity and the intrinsic coupling between state transitions and individual Q-value updates. In this work, we develop a unified theoretical framework that captures the evolution of agent behaviors at both the individual and population levels. By leveraging the pair-approximation technique from statistical physics, we derive a closed set of evolution equations that accurately describe the temporal dynamics of the system. Our analysis also reveals a separation of time scales: for small learning rates, state transitions equilibrate rapidly, while Q-values evolve slowly against stationary state distributions. Through extensive agent-based simulations, we validate the robustness of our theoretical results and explain the mechanisms that lead to the emergence of cooperation in social dilemmas. Our framework offers new perspectives for bridging complex systems science and MARL, providing insights for the design of cooperative and resilient AI.
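
To make the coupling between state transitions and Q-value updates concrete, the following is a minimal agent-based sketch, not the model used in this work. Everything in it is an assumption chosen for illustration: the ring graph, the prisoner's dilemma payoff values in `PAYOFF`, the choice of each agent's previous action as its state, epsilon-greedy action selection, and the function name `simulate`. The abstract specifies none of these. The sketch only shows where the two time scales enter: states are overwritten at every step, while each Q-value moves by a fraction `alpha` of its temporal-difference error.

```python
import random

# Hypothetical prisoner's dilemma payoffs (row player's payoff).
# Actions: 0 = cooperate, 1 = defect. The values are illustrative assumptions.
PAYOFF = [[3.0, 0.0],
          [5.0, 1.0]]


def simulate(n_agents=50, steps=200, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a ring graph.

    Returns the cooperation fraction at each step and the final Q-tables.
    """
    rng = random.Random(seed)
    # Assumption of this sketch: an agent's state is its own previous action.
    state = [rng.randrange(2) for _ in range(n_agents)]
    # q[i][s][a] is agent i's Q-value for action a in state s.
    q = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(n_agents)]
    history = []
    for _ in range(steps):
        # Every agent picks an action epsilon-greedily from its own Q-table
        # (ties are broken in favour of cooperation).
        action = []
        for i in range(n_agents):
            if rng.random() < epsilon:
                action.append(rng.randrange(2))
            else:
                row = q[i][state[i]]
                action.append(0 if row[0] >= row[1] else 1)
        # Reward is the mean payoff against the two ring neighbours.
        for i in range(n_agents):
            left = action[(i - 1) % n_agents]
            right = action[(i + 1) % n_agents]
            reward = 0.5 * (PAYOFF[action[i]][left] + PAYOFF[action[i]][right])
            s, a, s_next = state[i], action[i], action[i]
            # Slow variable: Q moves by a fraction alpha of the TD error.
            target = reward + gamma * max(q[i][s_next])
            q[i][s][a] += alpha * (target - q[i][s][a])
        # Fast variable: the state is replaced outright at every step.
        state = action
        history.append(1.0 - sum(action) / n_agents)
    return history, q
```

Calling `simulate(alpha=0.01)` and `simulate(alpha=0.5)` and comparing the returned histories is one way to explore how the learning rate changes the pace of Q-value evolution relative to state turnover in this toy setting.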
