Multi-Task Deep Reinforcement Learning for Terahertz NOMA Resource Allocation With Hybrid Discrete and Continuous Actions

Zhifeng Hu, Chong Han, Yansha Deng, Xudong Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Terahertz (THz) non-orthogonal multiple access (NOMA) networks have great potential for next-generation wireless communications, promising ultra-high data rates and user fairness. In THz-NOMA networks, efficient and effective long-term beamforming-bandwidth-power (BBP) allocation remains an open problem because it is non-deterministic polynomial-time hard (NP-hard). In this paper, the continuous nature of power and sub-array ratio assignment and the discrete nature of sub-band allocation are carefully treated. In light of these attributes, an offline hybrid discrete and continuous actions (DISCO) multi-task deep reinforcement learning (DRL) algorithm is proposed to maximize the long-term throughput. Specifically, multi-task learning enables the actor of DISCO to integrate two state-of-the-art DRL algorithms, namely actor-critic (AC), which selects only discrete actions, and deep deterministic policy gradient (DDPG), which generates only continuous actions. Rigorous theoretical derivations for the neural network design and backpropagation process are provided to tailor the proposed DISCO to the BBP problem. Compared with the benchmark no-learning and conventional DRL algorithms, DISCO enhances the network throughput while achieving good fairness among users. Furthermore, DISCO requires computational time on the order of hundreds of milliseconds, indicating its practicality.
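
The hybrid action structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' DISCO implementation: the class name `HybridActor`, the layer sizes, the random untrained weights, and the use of a softmax to normalize the power and sub-array ratios are all assumptions made for illustration. It shows only the general idea of a shared network trunk feeding one discrete head (a sub-band choice per user) and one continuous head (power and sub-array ratios).

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


class HybridActor:
    """Toy shared-trunk actor with a discrete and a continuous head.

    Hypothetical illustration only: weights are random and untrained,
    and all dimensions are assumptions, not values from the paper.
    """

    def __init__(self, state_dim, n_users, n_subbands, hidden=32, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_users = n_users
        self.n_subbands = n_subbands
        # Shared layer used by both tasks (the multi-task part).
        self.w_shared = self.rng.normal(0.0, 0.1, (state_dim, hidden))
        # Discrete head: one logit per (user, sub-band) pair.
        self.w_disc = self.rng.normal(0.0, 0.1, (hidden, n_users * n_subbands))
        # Continuous head: one power value and one sub-array value per user.
        self.w_cont = self.rng.normal(0.0, 0.1, (hidden, 2 * n_users))

    def act(self, state):
        h = np.tanh(state @ self.w_shared)

        # Discrete action: sample a sub-band index for each user.
        logits = (h @ self.w_disc).reshape(self.n_users, self.n_subbands)
        probs = softmax(logits, axis=1)
        subband = np.array(
            [self.rng.choice(self.n_subbands, p=p) for p in probs]
        )

        # Continuous action: ratios that sum to one across users
        # (an assumed normalization, chosen for simplicity).
        raw = (h @ self.w_cont).reshape(2, self.n_users)
        power_ratio = softmax(raw[0])
        subarray_ratio = softmax(raw[1])
        return subband, power_ratio, subarray_ratio


actor = HybridActor(state_dim=6, n_users=3, n_subbands=4)
subband, power_ratio, subarray_ratio = actor.act(np.ones(6))
```

In this sketch, `subband` holds one integer per user, while `power_ratio` and `subarray_ratio` are real-valued vectors, so a single forward pass yields both action types from the same shared features.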

Original language: English
Pages (from-to): 1-16
Number of pages: 16
Journal: IEEE Transactions on Vehicular Technology
Publication status: Accepted/In press - 2024

Keywords

  • Deep Reinforcement Learning (DRL)
  • Hybrid power systems
  • Multitasking
  • NOMA
  • Non-Orthogonal Multiple Access (NOMA)
  • Resource management
  • Terahertz (THz) networks
  • Terahertz communications
  • Throughput
  • Wireless communication
