TY - JOUR
T1 - Deep Reinforcement Learning-Based Grant-Free NOMA Optimization for mURLLC
AU - Liu, Yan
AU - Deng, Yansha
AU - Zhou, Hui
AU - Elkashlan, Maged
AU - Nallanathan, Arumugam
N1 - Funding Information:
This work was supported in part by the Engineering and Physical Sciences Research Council (EPSRC), U.K., under Grant EP/R006466/1 and Grant EP/W004348/1 and in part by the Postgraduate Research and Practice Innovation Program of Jiangsu Province under Grant KYCX17_0785.
Publisher Copyright:
© 1972-2012 IEEE.
PY - 2023/3/1
Y1 - 2023/3/1
N2 - Grant-free non-orthogonal multiple access (GF-NOMA) is a potential technique to support massive Ultra-Reliable and Low-Latency Communication (mURLLC) service. However, dynamic resource configuration in GF-NOMA systems is challenging due to random traffic and collisions, which are unknown at the base station (BS). Meanwhile, joint consideration of the latency and reliability requirements makes the resource configuration of GF-NOMA for mURLLC more complex. To address this problem, we develop a novel learning framework for signature-based GF-NOMA in mURLLC service, taking into account multiple access signature collision, UE detection, and the data decoding procedures for the K-repetition GF and the Proactive GF schemes. The goal of our learning framework is to maximize the long-term average number of successfully served users (UEs) under the latency constraint. We first perform a real-time repetition value configuration based on a double deep Q-Network (DDQN) and then propose a Cooperative Multi-Agent learning technique based on DQN (CMA-DQN) to optimize the configuration of both the repetition values and the contention-transmission unit (CTU) numbers. Our results show the superior performance of CMA-DQN over the conventional load estimation-based uplink resource configuration approach (LE-URC) in heavy traffic and demonstrate its capability for dynamic long-term resource configuration for mURLLC service. In addition, with our learning optimization, the Proactive scheme always outperforms the K-repetition scheme in terms of the number of successfully served UEs, especially under the high backlog traffic scenario.
AB - Grant-free non-orthogonal multiple access (GF-NOMA) is a potential technique to support massive Ultra-Reliable and Low-Latency Communication (mURLLC) service. However, dynamic resource configuration in GF-NOMA systems is challenging due to random traffic and collisions, which are unknown at the base station (BS). Meanwhile, joint consideration of the latency and reliability requirements makes the resource configuration of GF-NOMA for mURLLC more complex. To address this problem, we develop a novel learning framework for signature-based GF-NOMA in mURLLC service, taking into account multiple access signature collision, UE detection, and the data decoding procedures for the K-repetition GF and the Proactive GF schemes. The goal of our learning framework is to maximize the long-term average number of successfully served users (UEs) under the latency constraint. We first perform a real-time repetition value configuration based on a double deep Q-Network (DDQN) and then propose a Cooperative Multi-Agent learning technique based on DQN (CMA-DQN) to optimize the configuration of both the repetition values and the contention-transmission unit (CTU) numbers. Our results show the superior performance of CMA-DQN over the conventional load estimation-based uplink resource configuration approach (LE-URC) in heavy traffic and demonstrate its capability for dynamic long-term resource configuration for mURLLC service. In addition, with our learning optimization, the Proactive scheme always outperforms the K-repetition scheme in terms of the number of successfully served UEs, especially under the high backlog traffic scenario.
KW - deep reinforcement learning
KW - grant free
KW - mURLLC
KW - NOMA
KW - resource configuration
UR - http://www.scopus.com/inward/record.url?scp=85147307471&partnerID=8YFLogxK
U2 - 10.1109/TCOMM.2023.3238061
DO - 10.1109/TCOMM.2023.3238061
M3 - Article
AN - SCOPUS:85147307471
SN - 0090-6778
VL - 71
SP - 1475
EP - 1490
JO - IEEE Transactions on Communications
JF - IEEE Transactions on Communications
IS - 3
ER -