SOTAVerified

A Fully First-Order Layer for Differentiable Optimization

2025-12-02Code Available0· sign in to hype

Zihao Zhao, Kai-Chia Mo, Shing-Hei Ho, Brandon Amos, Kai Wang

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

Differentiable optimization layers enable learning systems to make decisions by solving embedded optimization problems. However, computing gradients via implicit differentiation requires solving a linear system with Hessian terms, which is both compute- and memory-intensive. To address this challenge, we propose a novel algorithm that computes the gradient using only first-order information. The key insight is to rewrite the differentiable optimization as a bilevel optimization problem and leverage recent advances in bilevel methods. Specifically, we introduce an active-set Lagrangian hypergradient oracle that avoids Hessian evaluations and provides finite-time, non-asymptotic approximation guarantees. We show that an approximate hypergradient can be computed using only first-order information in (1) time, leading to an overall complexity of (δ^-1ε^-3) for constrained bilevel optimization, which matches the best known rate for non-smooth non-convex optimization. Furthermore, we release an open-source Python library that can be easily adapted from existing solvers. Our code is available here: https://github.com/guaguakai/FFOLayer.

Reproductions