Abstract

We consider the problem of multiple agents cooperating in a partially observable environment. To solve their tasks successfully, agents must learn to coordinate and to share relevant information. This article describes Asynchronous Advantage Actor-Critic with Communication (A3C2), an end-to-end differentiable approach in which agents learn their policies and a communication protocol simultaneously. A3C2 follows a centralized-learning, distributed-execution paradigm and supports independent agents, dynamic team sizes, partially observable environments, and noisy communication. We show that A3C2 outperforms other state-of-the-art approaches in multiple environments.
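As a rough illustration of the abstract's core idea, the sketch below (a minimal PyTorch mock-up, not the authors' published implementation) shows an actor-critic network that jointly emits an action distribution, a value estimate, and a message vector passed through a noisy channel. The class name, layer sizes, tanh message squashing, and Gaussian noise model are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class CommAgent(nn.Module):
    """Actor-critic with a message head; illustrative sketch only."""
    def __init__(self, obs_dim, msg_dim, n_actions, hidden=64):
        super().__init__()
        # Trunk encodes the agent's own observation together with the
        # messages received from teammates at the previous step.
        self.trunk = nn.Sequential(nn.Linear(obs_dim + msg_dim, hidden),
                                   nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)  # actor
        self.value_head = nn.Linear(hidden, 1)           # critic
        self.msg_head = nn.Linear(hidden, msg_dim)       # outgoing message

    def forward(self, obs, incoming_msg, noise_std=0.1):
        h = self.trunk(torch.cat([obs, incoming_msg], dim=-1))
        dist = torch.distributions.Categorical(logits=self.policy_head(h))
        value = self.value_head(h)
        msg = torch.tanh(self.msg_head(h))
        # Additive Gaussian noise models the noisy channel; gradients
        # still flow through `msg`, keeping the pipeline end-to-end
        # differentiable.
        return dist, value, msg + noise_std * torch.randn_like(msg)

# One timestep for a two-agent team: agent 1's message feeds agent 2.
a1, a2 = CommAgent(obs_dim=4, msg_dim=8, n_actions=3), CommAgent(4, 8, 3)
silence = torch.zeros(8)                     # no message yet at t = 0
dist1, v1, m1 = a1(torch.randn(4), silence)  # agent 1 acts and speaks
dist2, v2, m2 = a2(torch.randn(4), m1)       # agent 2 conditions on m1
action1, action2 = dist1.sample(), dist2.sample()
```

Under the centralized-learning, distributed-execution paradigm the abstract describes, such a network would presumably be trained with gradients aggregated across all agents (e.g., by an asynchronous A3C-style learner), while at execution time each agent runs its own copy locally and exchanges only the message vectors.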

Details

Title
Multi Agent Deep Learning with Cooperative Communication
Author
Simões, David; Lau, Nuno; Reis, Luís Paulo
Pages
189-207
Publication year
2020
Publication date
2020
Publisher
De Gruyter Poland
e-ISSN
2449-6499
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2545231383
Copyright
© 2020. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0 (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.