Training Safe Neural Networks with Global SDP Bounds
Roman Soletskyi, David "davidad" Dalrymple
Abstract
This paper presents a novel approach to training neural networks with formal safety guarantees, using semidefinite programming (SDP) for verification. Our method verifies safety over large, high-dimensional input regions, addressing limitations of existing techniques that focus on adversarial robustness bounds. We introduce an ADMM-based training scheme that yields an accurate neural network classifier on the Adversarial Spheres dataset, achieving provably perfect recall for input dimensions up to d=40. This work advances the development of reliable neural network verification methods for high-dimensional systems, with potential applications to safe reinforcement-learning (RL) policies.
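As background for the benchmark named above: the Adversarial Spheres task (Gilmer et al., 2018) asks a classifier to separate points drawn from two concentric spheres in R^d. A minimal sketch of a data generator is below; the outer radius of 1.3 follows the original benchmark, but the exact sampling parameters used in this paper are an assumption here.

```python
import numpy as np

def adversarial_spheres(n, d, r_outer=1.3, seed=0):
    """Sample n points from two concentric spheres in R^d.

    Label 0: unit sphere (radius 1); label 1: sphere of radius r_outer.
    Uniformity on each sphere comes from normalizing Gaussian samples.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)  # project onto unit sphere
    y = rng.integers(0, 2, size=n)                 # random sphere assignment
    x[y == 1] *= r_outer                           # scale class-1 points outward
    return x, y

X, y = adversarial_spheres(1000, 40)
```

With this construction, "provably perfect recall" means certifying that every point of the positive sphere, not just sampled points, is classified correctly, which is what the SDP bounds are for.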