
L2 Regularization

See Weight Decay.

$L_{2}$ Regularization, or Weight Decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss and a penalty on the squared $L_{2}$ norm of the weights:

$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$

where $\lambda$ is a hyperparameter that controls the strength of the penalty; larger values of $\lambda$ push the weights toward smaller magnitudes.
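As a concrete illustration, here is a minimal NumPy sketch of this objective and its gradient; `loss_fn`, `grad_fn`, and `lam` are hypothetical names chosen for this example, not part of any particular library:

```python
import numpy as np

def l2_regularized(loss_fn, grad_fn, w, lam):
    """L_new(w) = L_original(w) + lam * w^T w, plus its gradient.

    loss_fn(w) -> scalar primary loss; grad_fn(w) -> gradient array.
    Both are hypothetical callables supplied by the caller.
    """
    loss = loss_fn(w) + lam * np.dot(w, w)   # add lambda * ||w||_2^2
    grad = grad_fn(w) + 2.0 * lam * w        # penalty gradient: 2 * lambda * w
    return loss, grad

# Example usage with a least-squares primary loss for data (X, y):
#   loss_fn = lambda w: 0.5 * np.sum((X @ w - y) ** 2)
#   grad_fn = lambda w: X.T @ (X @ w - y)
```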

Weight decay can also be incorporated directly into the weight update rule, rather than implicitly through the objective function. In common usage, weight decay refers to the implementation that specifies the decay directly in the update rule, whereas $L_{2}$ regularization refers to the implementation specified in the objective function.
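Concretely, for plain gradient descent with learning rate $\eta$, the decoupled update is $w \leftarrow \left(1 - \lambda\right)w - \eta\nabla L_{original}\left(w\right)$. The sketch below, continuing the hypothetical NumPy setup above, contrasts the two implementations:

```python
import numpy as np

def sgd_step_l2(w, grad, lr, lam):
    # L2 regularization: fold the penalty gradient 2*lam*w into the
    # loss gradient, then take an ordinary gradient step.
    return w - lr * (grad + 2.0 * lam * w)

def sgd_step_weight_decay(w, grad, lr, lam):
    # Weight decay: shrink the weights directly in the update rule,
    # independently of the loss gradient.
    return (1.0 - lam) * w - lr * grad
```

For plain SGD the two coincide up to a rescaling of the coefficient ($\lambda_{decay} = 2\eta\lambda_{L2}$ in this parameterization), but for adaptive optimizers such as Adam they differ, which is the distinction that decoupled weight decay (AdamW) is built on.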

Papers

Showing 51–75 of 128 papers (page 3 of 6). Titles marked [Code] have an associated code release; all papers shown have a Hype score of 0.
- Globally Gated Deep Linear Networks
- Linking Neural Collapse and L2 Normalization with Improved Out-of-Distribution Detection in Deep Neural Networks
- On the utility and protection of optimization with differential privacy and classic regularization techniques
- Perturbation of Deep Autoencoder Weights for Model Compression and Classification of Tabular Data
- Guidelines for the Regularization of Gammas in Batch Normalization for Deep Residual Networks
- A Note on the Regularity of Images Generated by Convolutional Neural Networks
- A Closer Look at Rehearsal-Free Continual Learning
- How Infinitely Wide Neural Networks Can Benefit from Multi-task Learning -- an Exact Macroscopic Characterization [Code]
- Probabilistic fine-tuning of pruning masks and PAC-Bayes self-bounded learning
- Disturbing Target Values for Neural Network Regularization [Code]
- Regularized Training of Nearest Neighbor Language Models
- Sequence Length is a Domain: Length-based Overfitting in Transformer Models [Code]
- Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
- Guiding Teacher Forcing with Seer Forcing for Neural Machine Translation
- The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective [Code]
- Learning with Hyperspherical Uniformity [Code]
- Effect of the regularization hyperparameter on deep learning-based segmentation in LGE-MRI
- Gram Regularization for Multi-view 3D Shape Retrieval
- Exponentially Weighted l_2 Regularization Strategy in Constructing Reinforced Second-order Fuzzy Rule-based Model
- An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning
- A Bayesian traction force microscopy method with automated denoising in a user-friendly software package
- Data-dependent Gaussian Prior Objective for Language Generation
- Correlated Initialization for Correlated Data
- Tighter Bound Estimation of Sensitivity Analysis for Incremental and Decremental Data Modification
- Regularisation Can Mitigate Poisoning Attacks: A Novel Analysis Based on Multiobjective Bilevel Optimisation
