
L2 Regularization

See Weight Decay.

$L_{2}$ Regularization, or Weight Decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss and a penalty on the $L_{2}$ norm of the weights:

$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$

where $\lambda$ is a hyperparameter that determines the strength of the penalty (encouraging smaller weights).
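
As a minimal sketch of the formula above (in NumPy; `original_loss`, `w`, and `lam` are hypothetical names for the primary loss value, the weight vector, and the penalty strength $\lambda$):

```python
import numpy as np

def l2_penalized_loss(original_loss, w, lam):
    """Add the L2 penalty lambda * w^T w to a scalar loss value."""
    return original_loss + lam * np.dot(w, w)

# Example with made-up values: lambda = 0.01
w = np.array([0.5, -1.2, 0.3])
loss = l2_penalized_loss(original_loss=2.0, w=w, lam=0.01)
print(loss)  # 2.0 + 0.01 * (0.25 + 1.44 + 0.09) = 2.0178
```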

Weight decay can also be incorporated directly into the weight update rule, rather than implicitly through the objective function. In practice, "weight decay" usually refers to the implementation where the weights are shrunk directly in the update rule, whereas "L2 regularization" usually refers to the implementation specified in the objective function. For plain SGD the two are equivalent, but they behave differently under adaptive optimizers such as Adam.
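
The distinction can be made concrete with a one-step SGD sketch (NumPy, made-up gradient values): folding the penalty's gradient $2\lambda w$ into the objective's gradient yields the same update as shrinking the weights directly in the update rule.

```python
import numpy as np

lr, lam = 0.1, 0.01
w = np.array([0.5, -1.2, 0.3])
grad = np.array([0.2, -0.1, 0.4])  # gradient of the original loss (made-up values)

# L2 regularization: the penalty's gradient 2*lambda*w is added to the
# objective's gradient before the SGD step.
w_l2 = w - lr * (grad + 2 * lam * w)

# Weight decay: the weights are shrunk directly in the update rule,
# then the original gradient step is applied.
w_wd = (1 - 2 * lr * lam) * w - lr * grad

# For plain SGD the two updates coincide; with adaptive optimizers
# (e.g. Adam vs. AdamW) they do not.
print(np.allclose(w_l2, w_wd))  # True
```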

Papers

Showing 101–125 of 128 papers

A New Angle on L2 Regularization
Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification (Code)
Deep Learning of Nonnegativity-Constrained Autoencoders for Enhanced Understanding of Data
Attentive Recurrent Tensor Model for Community Question Answering
Achieving Strong Regularization for Deep Neural Networks
Automatic Parameter Tying in Neural Networks
Pricing Football Players using Neural Networks
Data Fusion on Motion and Magnetic Sensors embedded on Mobile Devices for the Identification of Activities of Daily Living
Compressing Low Precision Deep Neural Networks Using Sparsity-Induced Regularization in Ternary Networks
Revisiting Activation Regularization for Language RNNs
Regularization techniques for fine-tuning in neural machine translation
Attention-Based End-to-End Speech Recognition on Voice Search
L2 Regularization versus Batch and Weight Normalization
Convolutional Neural Networks for Facial Expression Recognition (Code)
Large Scale Evolution of Convolutional Neural Networks Using Volunteer Computing
Neurogenesis-Inspired Dictionary Learning: Online Model Adaption in a Changing World (Code)
Robust method for finding sparse solutions to linear inverse problems using an L2 regularization
On Regularization Parameter Estimation under Covariate Shift (Code)
Feature Representation for ICU Mortality
Towards a Better Understanding of Predict and Count Models
To Drop or Not to Drop: Robustness, Consistency and Differential Privacy Properties of Dropout
A Bayesian encourages dropout
Automatic Discovery and Optimization of Parts for Image Classification
Action Classification with Locality-constrained Linear Coding
An efficient distributed learning algorithm based on effective local functional approximations
