SOTAVerified

L2 Regularization

See Weight Decay.

$L_{2}$ regularization, also called weight decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss function and a penalty on the squared $L_{2}$ norm of the weights:

$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$

where $\lambda$ is a value determining the strength of the penalty (encouraging smaller weights).
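
The penalized objective above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names (`l2_penalty`, `regularized_loss`, `primary_loss`), the toy one-sample squared-error loss, and the numeric values are all assumptions made for the example.

```python
def l2_penalty(w, lam):
    """Return lam * w^T w for a flat list of weights."""
    return lam * sum(wi * wi for wi in w)


def regularized_loss(original_loss, w, lam):
    """L_new(w) = L_original(w) + lam * w^T w."""
    return original_loss(w) + l2_penalty(w, lam)


def primary_loss(w):
    # Hypothetical primary loss: squared error of a linear model on one sample.
    x, y = [1.0, 2.0], 3.0
    prediction = sum(wi * xi for wi, xi in zip(w, x))
    return (prediction - y) ** 2


w = [0.5, -1.0]
lam = 0.5  # illustrative penalty strength

unpenalized = primary_loss(w)
penalized = regularized_loss(primary_loss, w, lam)
print(unpenalized, penalized)
```

A larger `lam` increases the cost of large weights relative to the primary loss, which is what pushes the minimizer toward smaller weights.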

Weight decay can be incorporated directly into the weight update rule, rather than only implicitly through the objective function. In practice, "weight decay" often refers to the implementation that specifies the shrinkage directly in the weight update rule, whereas "$L_{2}$ regularization" usually refers to the implementation that specifies the penalty in the objective function.
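
The two implementations can be compared side by side. The sketch below assumes plain gradient descent with a fixed learning rate; the function names, the `decay` parameter, and the numbers are hypothetical. Since the gradient of $\lambda w^{T}w$ is $2\lambda w$, the two updates coincide in this setting when `decay` is set to $2\lambda$.

```python
def sgd_step_l2(w, grad, lr, lam):
    """Penalty in the objective: its gradient 2 * lam * w is added to the loss gradient."""
    return [wi - lr * (gi + 2.0 * lam * wi) for wi, gi in zip(w, grad)]


def sgd_step_weight_decay(w, grad, lr, decay):
    """Decay in the update rule: shrink each weight, then take the plain gradient step."""
    return [(1.0 - lr * decay) * wi - lr * gi for wi, gi in zip(w, grad)]


w = [0.5, -1.0]
grad = [0.2, -0.4]  # gradient of the primary loss only (illustrative values)
lr, lam = 0.1, 0.01

via_objective = sgd_step_l2(w, grad, lr, lam)
via_update_rule = sgd_step_weight_decay(w, grad, lr, decay=2.0 * lam)

# For plain gradient descent the two results agree up to rounding.
print(via_objective, via_update_rule)
```

This equivalence is specific to the plain gradient step used here; the sketch makes no claim about update rules that rescale the gradient.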

Papers

Showing 51–100 of 128 papers

| Title | Status | Hype |
| --- | --- | --- |
| Gram Regularization for Multi-view 3D Shape Retrieval | | 0 |
| Guidelines for the Regularization of Gammas in Batch Normalization for Deep Residual Networks | | 0 |
| Guiding Teacher Forcing with Seer Forcing for Neural Machine Translation | | 0 |
| Implicit Filter Sparsification In Convolutional Neural Networks | | 0 |
| Linking Neural Collapse and L2 Normalization with Improved Out-of-Distribution Detection in Deep Neural Networks | | 0 |
| L2 Regularization versus Batch and Weight Normalization | | 0 |
| Large Scale Evolution of Convolutional Neural Networks Using Volunteer Computing | | 0 |
| Learning in Log-Domain: Subthreshold Analog AI Accelerator Based on Stochastic Gradient Descent | | 0 |
| Learning Sparse Low-Precision Neural Networks With Learnable Regularization | | 0 |
| Recurrent Stochastic Configuration Networks with Hybrid Regularization for Nonlinear Dynamics Modelling | | 0 |
| A Bayesian encourages dropout | | 0 |
| A Bayesian traction force microscopy method with automated denoising in a user-friendly software package | | 0 |
| Achieving Strong Regularization for Deep Neural Networks | | 0 |
| A Closer Look at Rehearsal-Free Continual Learning | | 0 |
| A Comparative Study of Neural Network Compression | | 0 |
| Action Classification with Locality-constrained Linear Coding | | 0 |
| Adaptive Estimators Show Information Compression in Deep Neural Networks | | 0 |
| A MAX-AFFINE SPLINE PERSPECTIVE OF RECURRENT NEURAL NETWORKS | | 0 |
| Analysis of High-dimensional Gaussian Labeled-unlabeled Mixture Model via Message-passing Algorithm | | 0 |
| Analysis of overfitting in the regularized Cox model | | 0 |
| Regularization techniques for fine-tuning in neural machine translation | | 0 |
| Regularized Policy Iteration | | 0 |
| Regularized Training of Nearest Neighbor Language Models | | 0 |
| Renewable Energy Prediction: A Comparative Study of Deep Learning Models for Complex Dataset Analysis | | 0 |
| Rethinking Conventional Wisdom in Machine Learning: From Generalization to Scaling | | 0 |
| Reverse Engineering Deep ReLU Networks An Optimization-based Algorithm | | 0 |
| Revisiting Activation Regularization for Language RNNs | | 0 |
| Robust method for finding sparse solutions to linear inverse problems using an L2 regularization | | 0 |
| Self-Distillation Amplifies Regularization in Hilbert Space | | 0 |
| Semantic segmentation for building houses from wooden cubes | | 0 |
| Improved error rates for sparse (group) learning with Lipschitz loss functions | | 0 |
| Super-Resolution for Remote Sensing Imagery via the Coupling of a Variational Model and Deep Learning | | 0 |
| The Ant Swarm Neuro-Evolution Procedure for Optimizing Recurrent Networks | | 0 |
| The Theory Behind Overfitting, Cross Validation, Regularization, Bagging, and Boosting: Tutorial | | 0 |
| Tighter Bound Estimation of Sensitivity Analysis for Incremental and Decremental Data Modification | | 0 |
| Tight Sample Complexity of Large-Margin Learning | | 0 |
| To Drop or Not to Drop: Robustness, Consistency and Differential Privacy Properties of Dropout | | 0 |
| Towards a Better Understanding of Predict and Count Models | | 0 |
| Training Dynamics of Nonlinear Contrastive Learning Model in the High Dimensional Limit | | 0 |
| Understand the Effect of Importance Weighting in Deep Learning on Dataset Shift | | 0 |
| Unsupervised Video Depth Estimation Based on Ego-motion and Disparity Consensus | | 0 |
| Weight decay induces low-rank attention layers | | 0 |
| Low-rank bias, weight decay, and model merging in neural networks | | 0 |
| Maintaining Plasticity in Continual Learning via Regenerative Regularization | | 0 |
| Maximum margin learning of t-SPNs for cell classification with filtered input | | 0 |
| Multi-branch fusion network for hyperspectral image classification | | 0 |
| Multimodal Bearing Fault Classification Under Variable Conditions: A 1D CNN with Transfer Learning | | 0 |
| On Implicit Filter Level Sparsity in Convolutional Neural Networks | | 0 |
| On sparse regression, Lp-regularization, and automated model discovery | | 0 |
| On the utility and protection of optimization with differential privacy and classic regularization techniques | | 0 |

No leaderboard results yet.