Efficient Online Learning for Cognitive Radar-Cellular Coexistence via Contextual Thompson Sampling

2020-08-24

Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone

Abstract

This paper describes a sequential, or online, learning scheme for adaptive radar transmissions that facilitate spectrum sharing with a non-cooperative cellular network. First, the interference channel between the radar and a spatially distant cellular network is modeled. Then, a linear Contextual Bandit (CB) learning framework is applied to drive the radar's behavior. The fundamental trade-off between exploration and exploitation is balanced by a proposed Thompson Sampling (TS) algorithm, a pseudo-Bayesian approach that selects waveform parameters based on the posterior probability that a specific waveform is optimal, given discounted channel information as context. It is shown that the contextual TS approach converges more rapidly than comparable contextual bandit algorithms to behavior that minimizes mutual interference and maximizes spectrum utilization. Additionally, we show that the TS learning scheme results in a favorable signal-to-interference-plus-noise ratio (SINR) distribution compared to other online learning algorithms. Finally, the proposed TS algorithm is compared to a deep reinforcement learning model. We show that the TS algorithm remains competitive with a more complex Deep Q-Network (DQN).
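
The posterior-sampling step described in the abstract can be illustrated with a generic linear-Gaussian Thompson Sampling sketch. This is not the authors' algorithm: the class name `LinearThompsonSampler`, the function `simulate`, the per-arm Gaussian model, the toy reward, and all parameter values are assumptions made for illustration, and the context is a random placeholder rather than discounted channel information. Each arm stands in for one candidate waveform.

```python
import numpy as np


class LinearThompsonSampler:
    """Generic linear-Gaussian Thompson Sampling over a finite set of arms.

    Hypothetical illustration only; not the paper's exact method.
    """

    def __init__(self, n_arms, dim, prior_precision=1.0, scale=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.scale = scale  # assumed exploration scale for the posterior draw
        # Per-arm precision matrix B and reward-weighted context sum f.
        self.B = np.stack([prior_precision * np.eye(dim) for _ in range(n_arms)])
        self.f = np.zeros((n_arms, dim))

    def select(self, context):
        """Draw one parameter sample per arm and pick the best sampled score."""
        scores = []
        for B, f in zip(self.B, self.f):
            cov = np.linalg.inv(B)
            mean = cov @ f
            theta = self.rng.multivariate_normal(mean, self.scale**2 * cov)
            scores.append(theta @ context)
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        """Rank-one posterior update for the arm that was played."""
        self.B[arm] += np.outer(context, context)
        self.f[arm] += reward * context


def simulate(n_rounds=300, seed=0):
    """Toy environment (assumed): arm 1 has the highest expected reward."""
    rng = np.random.default_rng(seed)
    true_theta = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]])
    agent = LinearThompsonSampler(n_arms=3, dim=2, seed=seed)
    choices = []
    for _ in range(n_rounds):
        x = rng.uniform(0.5, 1.0, size=2)  # placeholder context vector
        arm = agent.select(x)
        reward = true_theta[arm] @ x + 0.1 * rng.standard_normal()
        agent.update(arm, x, reward)
        choices.append(arm)
    return choices
```

Calling `simulate()` returns the sequence of chosen arms. In this toy setting the sampler should concentrate on arm 1 as its posteriors sharpen, which mirrors the exploration-to-exploitation transition the abstract refers to. A real radar application would replace the toy reward with a measured quantity and add the discounting of past observations that the abstract mentions.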

Tasks

Deep Reinforcement Learning, Thompson Sampling
