SOTAVerified

Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding

2024-05-28 · Code Available

Daniel Bethell, Simos Gerasimou, Radu Calinescu, Calum Imrie

Abstract

Empowering safe exploration of reinforcement learning (RL) agents during training is a critical challenge towards their deployment in many real-world scenarios. When prior knowledge of the domain or task is unavailable, training RL agents in unknown, black-box environments presents an even greater safety risk. We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training, and uses this knowledge to protect the RL agent from executing actions that yield likely hazardous outcomes. Our comprehensive experimental evaluation against state-of-the-art safe RL exploration techniques shows that ADVICE significantly reduces safety violations (≈50%) during training, with a competitive outcome reward compared to other techniques.
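The post-shielding idea described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the `risk_score` heuristic stands in for ADVICE's learned contrastive-autoencoder safety model, and all names and thresholds are hypothetical.

```python
def risk_score(state, action):
    # Placeholder for a learned safety model; ADVICE instead scores
    # state-action features with a contrastive autoencoder. This toy
    # heuristic just treats large |state + action| as risky.
    return abs(state + action) / 10.0

def shielded_action(state, proposed_action, candidate_actions, threshold=0.5):
    """Post-shield: pass the agent's proposed action through if it is
    judged safe; otherwise substitute the lowest-risk candidate."""
    if risk_score(state, proposed_action) <= threshold:
        return proposed_action
    return min(candidate_actions, key=lambda a: risk_score(state, a))
```

A post-shield of this shape sits between the policy and the environment, so the underlying RL algorithm needs no modification: it simply observes the (possibly substituted) executed action.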
