TY - CHAP
T1 - Channel Selection and Power Control for D2D Communication via Online Reinforcement Learning
AU - Sun, Zhenfeng
AU - Nakhai, Mohammad Reza
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/6
Y1 - 2021/6
N2 - In this paper, we address the problem of Device-to-Device (D2D) communication in the next generations of cellular networks, where the number of D2D pairs can grow large and, hence, improving spectral efficiency becomes a crucial design factor. More specifically, decentralized channel selection and power control by D2D pairs for interference mitigation, without inflicting a heavy control overhead on the network, become significant challenges in allocating resources. To this end, we introduce an online distributed reinforcement learning algorithm at D2D pairs to maximize network throughput, while guaranteeing both D2D users' and cellular users' (CUs') Quality of Service (QoS) under the dynamic wireless channel environment. To track and evaluate the performance of the proposed online algorithm, we define three metrics, i.e., D2D collision probability, D2D access rate, and time-average network throughput. The simulation results confirm the convergence property of the proposed algorithm and show improved performance in terms of the three defined metrics as compared to the celebrated Q-learning-based method.
KW - D2D communication
KW - deep reinforcement learning
KW - power control
UR - http://www.scopus.com/inward/record.url?scp=85115727532&partnerID=8YFLogxK
U2 - 10.1109/ICC42927.2021.9501055
DO - 10.1109/ICC42927.2021.9501055
M3 - Conference paper
AN - SCOPUS:85115727532
T3 - IEEE International Conference on Communications
BT - ICC 2021 - IEEE International Conference on Communications, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Communications, ICC 2021
Y2 - 14 June 2021 through 23 June 2021
ER -