Abstract

Reinforcement learning (RL), an interdisciplinary field at the intersection of artificial intelligence and control science, is developing along a cross-disciplinary path led by artificial intelligence and has become a research hotspot in optimal control. This paper systematically reviews the development of RL, focusing on the intrinsic connection between single-agent reinforcement learning (SARL) and multi-agent reinforcement learning (MARL). First, starting from the formation and development of RL, it describes the similarities and differences between RL and other machine learning paradigms and briefly introduces the main branches of current RL. Then, taking the fundamentals and core ideas of SARL as the basic framework and extending them to multi-agent system (MAS) collaborative control, it explores the consistency between the two in theoretical frameworks and algorithm design. On this basis, the paper regroups SARL algorithms into three types: dynamic programming, value function decomposition, and policy gradient (PG). It abstracts MARL algorithms into four paradigms: behavior analysis, centralized learning, communication learning, and collaborative learning. Together, these establish a mapping of algorithms from single-agent to multi-agent scenarios. This framework offers a new perspective on how the two families of methods evolved in relation to each other. The paper also discusses the challenges MARL faces in solving large-scale MAS problems and ideas for addressing them. It aims to serve as a reference for researchers in the field and to promote the development of cooperative control and optimization methods for MAS, as well as related application research.

© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.