© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

In recent years, the Internet of Vehicles (IoV) has shown great potential for advancing intelligent transportation systems (ITSs) and smart cities. However, traditional IoV schemes have difficulty dealing with uncertain environments, whereas reinforcement learning is well suited to them. Spectrum resource allocation in IoV faces an uncertain environment in most cases. Therefore, this paper investigates spectrum resource allocation by deep reinforcement learning, applied after spectrum sensing, in an ITS that includes vehicle-to-infrastructure (V2I) links and vehicle-to-vehicle (V2V) links. Spectrum resource allocation is modeled as a multi-agent reinforcement learning problem and solved using the soft actor critic (SAC) algorithm. Each V2V link is treated as an agent; the agents interact with the vehicular environment and together form a joint action. Each agent then receives its own observation and a shared reward, and updates its networks using experiences drawn from memory. Over a given period, each V2V link can thus optimize its spectrum allocation scheme to maximize the V2I capacity and increase the V2V payload delivery rate. However, the number of SAC networks grows linearly with the number of V2V links, so convergence may become difficult when there are too many V2V links. Consequently, a new algorithm, parameter sharing soft actor critic (PSSAC), is proposed to reduce the complexity and make the model easier to converge. Simulation results show that both SAC and PSSAC improve the V2I capacity and increase the probability of successful V2V payload transmission within a certain time. Specifically, the proposed schemes achieve a 10 percent performance improvement over the existing scheme in the vehicular environment. Additionally, PSSAC has lower complexity.

Details

Title
A Deep Reinforcement Learning Scheme for Spectrum Sensing and Resource Allocation in ITS
Author
Huang, Wei 1; Yuyang Peng 1; Yue, Ming 1; Long, Jiale 2; AL-Hazemi, Fawaz 3; Mohammad Meraj Mirza 4

1 The School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China; [email protected] (H.W.); [email protected] (M.Y.)
2 Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529020, China; [email protected]
3 Department of Computer and Network Engineering, University of Jeddah, Jeddah 21959, Saudi Arabia; [email protected]
4 Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia; [email protected]
First page
3437
Publication year
2023
Publication date
2023
Publisher
MDPI AG
e-ISSN
22277390
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2857121312