Comparisons Are All You Need for Optimizing Smooth Functions
Chenyi Zhang, Tongyang Li
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
When optimizing machine learning models, there are various scenarios where gradient computations are challenging or even infeasible. Furthermore, in reinforcement learning (RL), preference-based RL that only compares between options has wide applications, including reinforcement learning with human feedback in large language models. In this paper, we systematically study optimization of a smooth function fR^nR only assuming an oracle that compares function values at two points and tells which is larger. When f is convex, we give two algorithms using O(n/) and O(n^2) comparison queries to find an -optimal solution, respectively. When f is nonconvex, our algorithm uses O(n/^2) comparison queries to find an -approximate stationary point. All these results match the best-known zeroth-order algorithms with function evaluation queries in n dependence, thus suggest that comparisons are all you need for optimizing smooth functions using derivative-free methods. In addition, we also give an algorithm for escaping saddle points and reaching an -second order stationary point of a nonconvex f, using O(n^1.5/^2.5) comparison queries.