Abstract
We introduce a reinforcement learning algorithm, inspired by the combinatorial multi-armed bandit (CMAB) problem, to minimize the time-averaged energy cost at individual base stations (BSs) powered by various energy markets and local renewable energy sources over a finite time horizon. The algorithm sustains traffic demands by combining sparse beamforming, which schedules dynamic user-to-BS allocation, with proactive energy provisioning at the BSs, which makes ahead-of-time, price-aware energy management decisions. Simulation results indicate that the proposed algorithm outperforms recently proposed cooperative energy management designs in reducing the overall energy cost.
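To make the bandit framing concrete, the following is a minimal sketch of a generic combinatorial UCB-style learner for cost minimization. It is not the algorithm from the paper: it assumes a simplified setting in which a BS picks `m` of several energy sources each round and observes the noisy price of each source it picked. Sparse beamforming, renewable generation, and traffic constraints are left out. All names (`cucb_select`, `run_cucb`, `true_costs`) and the confidence constant are hypothetical choices made for illustration.

```python
import math
import random


def cucb_select(counts, means, t, m):
    """Return the m arms with the smallest lower confidence bound on cost."""
    def lcb(k):
        if counts[k] == 0:
            return float("-inf")  # untried arms are selected first
        # Optimistic (low) estimate of the cost; the 1.5 factor is an assumed constant.
        return means[k] - math.sqrt(1.5 * math.log(t) / counts[k])
    return sorted(range(len(counts)), key=lcb)[:m]


def run_cucb(true_costs, m, horizon, seed=0):
    """Simulate the learner; true_costs are hypothetical mean prices in [0, 1]."""
    rng = random.Random(seed)
    counts = [0] * len(true_costs)   # times each source was chosen
    means = [0.0] * len(true_costs)  # running average of observed cost
    total = 0.0
    for t in range(1, horizon + 1):
        for arm in cucb_select(counts, means, t, m):
            # Semi-bandit feedback: a noisy cost is seen for every chosen arm.
            cost = min(1.0, max(0.0, rng.gauss(true_costs[arm], 0.05)))
            counts[arm] += 1
            means[arm] += (cost - means[arm]) / counts[arm]
            total += cost
    return counts, total / horizon


if __name__ == "__main__":
    counts, avg_cost = run_cucb([0.9, 0.2, 0.5, 0.3], m=2, horizon=2000)
    print(counts, round(avg_cost, 3))
```

In this toy setting the learner concentrates its selections on the two cheapest sources as the confidence bounds of the expensive ones tighten, which is the exploration-exploitation trade-off that the CMAB formulation captures.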
Original language | English |
---|---|
Article number | 7887725 |
Pages (from-to) | 1609-1612 |
Number of pages | 4 |
Journal | IEEE COMMUNICATIONS LETTERS |
Volume | 21 |
Issue number | 7 |
Early online date | 27 Mar 2017 |
DOIs | |
Publication status | Published - 1 Jul 2017 |
Keywords
- CMAB
- Energy management
- Online learning