
Abstract

Reinforcement learning (RL), an interdisciplinary field at the intersection of artificial intelligence and control science, is increasingly driven by advances in artificial intelligence and has become a research hotspot in optimal control. This paper systematically reviews the development of RL, focusing on the intrinsic connection between single-agent reinforcement learning (SARL) and multi-agent reinforcement learning (MARL). First, starting from the origins and development of RL, it describes the similarities and differences between RL and other machine learning paradigms and briefly introduces the main branches of current RL. Then, taking the basic concepts and core ideas of SARL as a framework and extending them to the cooperative control of multi-agent systems (MAS), it explores the continuity between the two in theoretical frameworks and algorithm design. On this basis, the paper groups SARL algorithms into three types (dynamic programming, value function decomposition, and policy gradient (PG)) and abstracts MARL algorithms into four paradigms (behavior analysis, centralized learning, communication learning, and collaborative learning), thereby establishing a mapping between algorithms for single-agent and multi-agent scenarios. This framework offers a new perspective on how the two families of methods evolved from one another. The paper also discusses the challenges MARL faces in solving large-scale MAS problems and possible solutions. It aims to provide a reference for researchers in the field and to promote the development of cooperative control and optimization methods for MAS, as well as related application research.

Details

Title
Reinforcement learning for single-agent to multi-agent systems: from basic theory to industrial application progress, a survey
Author
Zhang, Dehua 1; Yuan, Qingsong 1; Meng, Lei 1; Xia, Ruixue 1; Liu, Wei 2; Qin, Chunbin 1

1 Henan University, School of Artificial Intelligence, Zhengzhou, China (GRID:grid.256922.8) (ISNI:0000 0000 9139 560X)
2 Nanyang Normal University, School of Intelligent Manufacturing and Electrical Engineering (Collaborative Innovation Center of Intelligent Explosion-proof Equipment, Henan Province), Nanyang, China (GRID:grid.453722.5) (ISNI:0000 0004 0632 3548)
Publication title
Artificial Intelligence Review
Volume
59
Issue
2
Pages
46
Publication year
2026
Publication date
Feb 2026
Publisher
Springer Nature B.V.
Place of publication
Dordrecht
Country of publication
Netherlands
ISSN
02692821
e-ISSN
15737462
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-11-27
Milestone dates
2025-10-30 (Registration); 2025-04-29 (Received); 2025-10-30 (Accepted)
First posting date
27 Nov 2025
ProQuest document ID
3291095910
Document URL
https://www.proquest.com/scholarly-journals/reinforcement-learning-single-agent-multi-systems/docview/3291095910/se-2?accountid=208611
Copyright
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2026-01-08
Database
ProQuest One Academic