King's College London Research Portal

Intelligent Trajectory Planning in UAV-mounted Wireless Networks: A Quantum-Inspired Reinforcement Learning Perspective

Research output: Contribution to journal › Article › peer-review

Original language: English
Article number: 9456900
Pages (from-to): 1994-1998
Number of pages: 5
Journal: IEEE Wireless Communications Letters
Volume: 10
Issue number: 9
Early online date: 16 Jun 2021
E-pub ahead of print: 16 Jun 2021
Published: Sep 2021

Bibliographical note

Funding Information: Manuscript received May 12, 2021; accepted June 8, 2021. Date of publication June 16, 2021; date of current version September 9, 2021. This work was supported by the K-CSC scholarship, funded jointly by the China Scholarship Council and King's College London, under Grant CSC201908350102. The associate editor coordinating the review of this article and approving it for publication was Z. Chang. (Corresponding author: Yuanjian Li.) Yuanjian Li and A. Hamid Aghvami are with the Centre for Telecommunications Research, King's College London, London WC2R 2LS, U.K. (e-mail: yuanjian.li@kcl.ac.uk; hamid.aghvami@kcl.ac.uk).

Publisher Copyright: © 2021 IEEE.


Abstract

In this letter, we consider a wireless uplink transmission scenario in which an unmanned aerial vehicle (UAV) serves as an aerial base station collecting data from ground users. To optimize the expected sum uplink transmit rate without any prior knowledge of the ground users (e.g., their locations, channel state information, and transmit power), the trajectory planning problem is solved via a quantum-inspired reinforcement learning (QiRL) approach. Specifically, the QiRL method adopts a novel probabilistic action-selection policy and a new reinforcement strategy, inspired respectively by the collapse phenomenon and amplitude amplification in quantum computation theory. Numerical results demonstrate that the proposed QiRL solution offers a natural balance between exploration and exploitation by ranking the collapse probabilities of possible actions, in contrast to traditional reinforcement learning approaches, which depend heavily on hand-tuned exploration parameters.
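To make the abstract's mechanism concrete, below is a minimal Python sketch of a quantum-inspired action selector: each candidate UAV action carries a probability amplitude, selection mimics measurement collapse (an action is sampled with probability equal to its squared amplitude), and a rewarded action is reinforced with a Grover-style amplitude-amplification round. This is an illustration under stated assumptions only; the class name, the fixed round count k, and the explicit re-normalisation are expository choices, not the authors' exact algorithm.

    import numpy as np

    rng = np.random.default_rng(seed=1)

    class QiRLActionSelector:
        def __init__(self, n_actions: int):
            # Uniform "superposition": every action starts with equal amplitude.
            self.amps = np.full(n_actions, 1.0 / np.sqrt(n_actions))

        def select(self) -> int:
            # Collapse-inspired selection: the chance of observing an action
            # equals the squared magnitude of its amplitude.
            probs = self.amps ** 2
            return int(rng.choice(self.amps.size, p=probs / probs.sum()))

        def reinforce(self, action: int, k: int = 1) -> None:
            # Grover-style amplitude amplification: flip the sign of the chosen
            # action's amplitude (oracle), then invert all amplitudes about
            # their mean (diffusion); each round shifts probability toward it.
            for _ in range(k):
                self.amps[action] *= -1.0
                self.amps = 2.0 * self.amps.mean() - self.amps
            self.amps /= np.linalg.norm(self.amps)  # keep amplitudes normalised

    # Hypothetical usage in a trajectory-planning loop:
    selector = QiRLActionSelector(n_actions=8)  # e.g., eight compass headings
    action = selector.select()
    # ...execute the move, observe the uplink sum-rate reward...
    selector.reinforce(action)  # amplify if the observed reward was favourable

In QiRL-style schemes the number of amplification rounds is typically scaled with the received reward, so actions yielding higher uplink sum rates see their collapse probabilities raised more sharply while weakly rewarded actions retain non-negligible probability; with very few actions a single full Grover round can nearly saturate the distribution, which is why reward-dependent scaling matters for the exploration-exploitation balance noted in the abstract.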

