
Stochastic Optimization

Stochastic optimization is the task of optimizing an objective functional by generating and using random variables. It is usually an iterative process in which randomly generated variables progressively locate the minima or maxima of the objective functional. Stochastic optimization is typically applied in non-convex function spaces, where standard deterministic methods such as linear or quadratic programming and their variants cannot be used.

Source: ASOC: An Adaptive Parameter-free Stochastic Optimization Techinique for Continuous Variables
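
The iterative process described above can be illustrated with a minimal sketch. This is not the ASOC method from the cited source; it is a plain random search, chosen here only because it is one of the simplest stochastic optimizers. The Rastrigin test function, the names `rastrigin` and `random_search`, the step size, the iteration count, and the starting point are all assumptions made for illustration.

```python
import math
import random


def rastrigin(x):
    """Non-convex test objective with many local minima; its global minimum is 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def random_search(objective, x0, step=0.5, iterations=5000, seed=0):
    """Minimize `objective` by proposing Gaussian perturbations of the best point found so far."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same trajectory
    best_x = list(x0)
    best_f = objective(best_x)
    history = [best_f]  # best objective value after each iteration
    for _ in range(iterations):
        # Generate a random candidate near the current best point.
        candidate = [xi + rng.gauss(0.0, step) for xi in best_x]
        f = objective(candidate)
        # Keep the candidate only if it improves the objective.
        if f < best_f:
            best_x, best_f = candidate, f
        history.append(best_f)
    return best_x, best_f, history


if __name__ == "__main__":
    x, fx, hist = random_search(rastrigin, x0=[3.2, -2.7])
    print("start value:", hist[0])
    print("best value: ", fx)
    print("best point: ", x)
```

Because a candidate is accepted only when it lowers the objective, the recorded best value never increases. The random proposals are what allow the search to jump out of some local minima where a purely local deterministic method could stall, although nothing in this sketch guarantees reaching the global minimum.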

Papers

Showing 51–100 of 1387 papers

| Title | Status | Hype |
|---|---|---|
| Stochastic Gradient Descent Captures How Children Learn About Physics | Code | 1 |
| Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks | Code | 1 |
| Efficient approximation of Jacobian matrices involving a non-uniform fast Fourier transform (NUFFT) | Code | 1 |
| Stochastic Hyperparameter Optimization through Hypernetworks | Code | 1 |
| An Analysis of the Adaptation Speed of Causal Models | Code | 1 |
| The Acquisition of Physical Knowledge in Generative Neural Networks | Code | 1 |
| Training Deep Networks without Learning Rates Through Coin Betting | Code | 1 |
| Training-free Diffusion Model Alignment with Sampling Demons | Code | 1 |
| Adaptive Semantic Token Communication for Transformer-based Edge Inference | Code | 1 |
| Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning | Code | 1 |
| Why Do We Need Weight Decay in Modern Deep Learning? | Code | 1 |
| A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning | Code | 1 |
| Differentiable Quality Diversity | Code | 1 |
| End-to-End Stochastic Optimization with Energy-Based Model | Code | 1 |
| Distributionally Robust Neural Networks | Code | 1 |
| Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization | Code | 1 |
| Exploiting Explainable Metrics for Augmented SGD | Code | 1 |
| Federated Learning over Wireless Networks: Convergence Analysis and Resource Allocation | Code | 1 |
| A Novel Unified Parametric Assumption for Nonconvex Optimization | Code | 1 |
| Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization | Code | 1 |
| Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization | Code | 1 |
| Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization | Code | 1 |
| Learning from History for Byzantine Robust Optimization | Code | 1 |
| Lookahead Optimizer: k steps forward, 1 step back | Code | 1 |
| ATD: Augmenting CP Tensor Decomposition by Self Supervision | Code | 1 |
| Monte Carlo Policy Gradient Method for Binary Optimization | Code | 1 |
| Online Learning Rate Adaptation with Hypergradient Descent | Code | 1 |
| On the Variance of the Adaptive Learning Rate and Beyond | Code | 1 |
| Adafactor: Adaptive Learning Rates with Sublinear Memory Cost | Code | 1 |
| ADMM for Efficient Deep Learning with Global Convergence | Code | 1 |
| JaxSGMC: Modular stochastic gradient MCMC in JAX | Code | 1 |
| Averaging Weights Leads to Wider Optima and Better Generalization | Code | 1 |
| ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning | Code | 1 |
| PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees | Code | 1 |
| Adam: A Method for Stochastic Optimization | Code | 1 |
| BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery | Code | 1 |
| Bi-level Score Matching for Learning Energy-based Latent Variable Models | Code | 1 |
| Quality-Diversity Optimization: a novel branch of stochastic optimization | Code | 1 |
| BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models | Code | 1 |
| Randomized Physics-Informed Neural Networks for Bayesian Data Assimilation | Code | 1 |
| Adapting to Mixing Time in Stochastic Optimization with Markovian Data | Code | 1 |
| Reinforcement Learning with Dynamic Convex Risk Measures | Code | 1 |
| Shampoo: Preconditioned Stochastic Tensor Optimization | Code | 1 |
| A Framework for Improving the Reliability of Black-box Variational Inference | Code | 1 |
| Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication | Code | 0 |
| DADAM: A Consensus-based Distributed Adaptive Gradient Method for Online Optimization | Code | 0 |
| Decision-Dependent Stochastic Optimization: The Role of Distribution Dynamics | Code | 0 |
| ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization | Code | 0 |
| CProp: Adaptive Learning Rate Scaling from Past Gradient Conformity | Code | 0 |
| Coupling Adaptive Batch Sizes with Learning Rates | Code | 0 |

Benchmark Results

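| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AvaGrad | Accuracy | 81.24 | | Unverified |
| 2 | AdaShift | Accuracy | 81.12 | | Unverified |
| 3 | Adam (eps-adjusted) | Accuracy | 81.04 | | Unverified |
| 4 | SGD | Accuracy | 80.95 | | Unverified |
| 5 | AdamW | Accuracy | 79.87 | | Unverified |
| 6 | AdaBound | Accuracy | 77.24 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Adam (eps-adjusted) | Accuracy | 96.36 | | Unverified |
| 2 | AvaGrad | Accuracy | 96.2 | | Unverified |
| 3 | SGD | Accuracy | 96.14 | | Unverified |
| 4 | AdaShift | Accuracy | 95.92 | | Unverified |
| 5 | AdamW | Accuracy | 95.89 | | Unverified |
| 6 | AdaBound | Accuracy | 94.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SGD - cosine LR schedule | Accuracy | 95.55 | | Unverified |
| 2 | Lookahead | Accuracy | 95.27 | | Unverified |
| 3 | SGD | Accuracy | 95.23 | | Unverified |
| 4 | ADAM | Accuracy | 94.84 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AvaGrad | Top 1 Accuracy | 76.51 | | Unverified |
| 2 | SGD | Top 1 Accuracy | 75.99 | | Unverified |
| 3 | AdamW | Top 1 Accuracy | 72.9 | | Unverified |
| 4 | AdaBound | Top 1 Accuracy | 72.01 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AdaBound | Bit per Character (BPC) | 2.86 | | Unverified |
| 2 | AdaShift | Bit per Character (BPC) | 1.27 | | Unverified |
| 3 | AdamW | Bit per Character (BPC) | 1.23 | | Unverified |
| 4 | AvaGrad | Bit per Character (BPC) | 1.18 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Resnet18 | Accuracy (max) | 86.85 | | Unverified |
| 2 | Resnet34 | Accuracy (max) | 86.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Resnet18 | Accuracy (max) | 58.48 | | Unverified |
| 2 | Resnet34 | Accuracy (max) | 54.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SGD | Top 5 Accuracy | 92.15 | | Unverified |
| 2 | Lookahead | Top 1 Accuracy | 75.13 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Lookahead | Top 1 Accuracy | 75.49 | | Unverified |
| 2 | SGD | Top 1 Accuracy | 75.15 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Bert | Accuracy (max) | 93.99 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Bert | Accuracy (max) | 86.34 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MLP | NLL | 0.05 | | Unverified |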