
L2 Regularization

See Weight Decay.

$L_{2}$ Regularization, or Weight Decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss function and a penalty on the squared $L_{2}$ norm of the weights:

$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$

where $\lambda$ is a hyperparameter that sets the strength of the penalty; larger values encourage smaller weights.
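As a minimal sketch of the penalized objective, the following pure-Python snippet evaluates $L_{new}$ and the gradient of the penalty term for a flat list of weights. The function names, the example weights, and the placeholder value of the original loss are assumptions made for illustration; they do not come from any particular library.

```python
def l2_penalty(weights, lam):
    """Return lam * w^T w for a flat list of weights."""
    return lam * sum(w * w for w in weights)


def l2_penalty_grad(weights, lam):
    """Gradient of lam * w^T w with respect to each weight: 2 * lam * w."""
    return [2.0 * lam * w for w in weights]


def regularized_loss(original_loss, weights, lam):
    """L_new(w) = L_original(w) + lam * w^T w."""
    return original_loss + l2_penalty(weights, lam)


# Illustrative values only: the original loss is a stand-in constant.
weights = [0.5, -1.0, 2.0]
lam = 0.1
total = regularized_loss(original_loss=1.0, weights=weights, lam=lam)
penalty_grad = l2_penalty_grad(weights, lam)
print(total, penalty_grad)
```

The penalty gradient is proportional to the weights themselves, so each gradient step pulls every weight toward zero by an amount that grows with its magnitude.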

Weight decay can be incorporated directly into the weight update rule, rather than only implicitly by defining it through the objective function. Often, weight decay refers to the implementation where it is specified directly in the weight update rule, whereas L2 regularization usually refers to the implementation specified in the objective function.
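The two formulations can be compared with a short sketch, assuming plain stochastic gradient descent with learning rate `lr`. Putting the penalty in the objective adds $2\lambda w$ to the gradient, giving $w \leftarrow w - \eta\,(g + 2\lambda w)$, while the update-rule form shrinks the weights directly, $w \leftarrow w - \eta\,\lambda' w - \eta\,g$. Under this assumption the two steps coincide when $\lambda' = 2\lambda$; this equivalence need not hold for other optimizers. All function and variable names below are hypothetical.

```python
def sgd_step_l2_objective(w, grad_original, lr, lam):
    """One SGD step on L_original(w) + lam * w^T w.

    The gradient of the penalty, 2 * lam * w, is added to the
    gradient of the original loss before the step is taken.
    """
    return [wi - lr * (gi + 2.0 * lam * wi) for wi, gi in zip(w, grad_original)]


def sgd_step_weight_decay(w, grad_original, lr, decay):
    """One SGD step with decay written into the update rule.

    The weights are shrunk by lr * decay * w, and only the gradient
    of the original loss is used for the descent step.
    """
    return [wi - lr * decay * wi - lr * gi for wi, gi in zip(w, grad_original)]


# Illustrative values only.
w = [0.5, -1.0, 2.0]
grad = [0.1, 0.2, -0.3]
lr, lam = 0.01, 0.1

via_objective = sgd_step_l2_objective(w, grad, lr, lam)
via_update_rule = sgd_step_weight_decay(w, grad, lr, decay=2.0 * lam)
print(via_objective)
print(via_update_rule)
```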

Papers

Showing 1-50 of 128 papers

| Title | Status | Hype |
| --- | --- | --- |
| Maintaining Plasticity in Deep Continual Learning | Code | 2 |
| The Transient Nature of Emergent In-Context Learning in Transformers | Code | 1 |
| Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks | Code | 1 |
| It's Enough: Relaxing Diagonal Constraints in Linear Autoencoders for Recommendation | Code | 1 |
| Motion Correction and Volumetric Reconstruction for Fetal Functional Magnetic Resonance Imaging Data | Code | 1 |
| Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network | Code | 1 |
| Neural Pruning via Growing Regularization | Code | 1 |
| Label-Only Membership Inference Attacks | Code | 1 |
| Distributionally Robust Neural Networks | Code | 1 |
| Quantifying Generalization in Reinforcement Learning | Code | 1 |
| Re-evaluating Continual Learning Scenarios: A Categorization and Case for Strong Baselines | Code | 1 |
| Overcoming catastrophic forgetting in neural networks | | 0 |
| From large-eddy simulations to deep learning: A U-net model for fast urban canopy flow predictions | Code | 0 |
| DACN: Dual-Attention Convolutional Network for Hyperspectral Image Super-Resolution | Code | 0 |
| Geometry of Learning -- L2 Phase Transitions in Deep and Shallow Neural Networks | | 0 |
| Understand the Effect of Importance Weighting in Deep Learning on Dataset Shift | | 0 |
| Deep Learning in Renewable Energy Forecasting: A Cross-Dataset Evaluation of Temporal and Spatial Models | | 0 |
| Semantic segmentation for building houses from wooden cubes | | 0 |
| GPT Meets Graphs and KAN Splines: Testing Novel Frameworks on Multitask Fine-Tuned GPT-2 with LoRA | | 0 |
| CtrTab: Tabular Data Synthesis with High-Dimensional and Limited Data | | 0 |
| Low-rank bias, weight decay, and model merging in neural networks | | 0 |
| Multimodal Bearing Fault Classification Under Variable Conditions: A 1D CNN with Transfer Learning | | 0 |
| Renewable Energy Prediction: A Comparative Study of Deep Learning Models for Complex Dataset Analysis | | 0 |
| Learning in Log-Domain: Subthreshold Analog AI Accelerator Based on Stochastic Gradient Descent | | 0 |
| Super-Resolution for Remote Sensing Imagery via the Coupling of a Variational Model and Deep Learning | | 0 |
| Parkinson's Disease Diagnosis Through Deep Learning: A Novel LSTM-Based Approach for Freezing of Gait Detection | | 0 |
| Effectiveness of L2 Regularization in Privacy-Preserving Machine Learning | | 0 |
| Analysis of High-dimensional Gaussian Labeled-unlabeled Mixture Model via Message-passing Algorithm | | 0 |
| Recurrent Stochastic Configuration Networks with Hybrid Regularization for Nonlinear Dynamics Modelling | | 0 |
| Carbon price fluctuation prediction using blockchain information A new hybrid machine learning approach | | 0 |
| Weight decay induces low-rank attention layers | | 0 |
| WALINET: A water and lipid identification convolutional Neural Network for nuisance signal removal in 1H MR Spectroscopic Imaging | Code | 0 |
| Rethinking Conventional Wisdom in Machine Learning: From Generalization to Scaling | | 0 |
| Training Dynamics of Nonlinear Contrastive Learning Model in the High Dimensional Limit | | 0 |
| Comparative Study of Bitcoin Price Prediction | | 0 |
| Derivative-based regularization for regression | | 0 |
| Monkeypox disease recognition model based on improved SE-InceptionV3 | Code | 0 |
| Convergence of a L2 regularized Policy Gradient Algorithm for the Multi Armed Bandit | Code | 0 |
| An Experiment on Feature Selection using Logistic Regression | | 0 |
| Prevalidated ridge regression is a highly-efficient drop-in replacement for logistic regression for high-dimensional data | Code | 0 |
| Reverse Engineering Deep ReLU Networks An Optimization-based Algorithm | | 0 |
| Gradient-based bilevel optimization for multi-penalty Ridge regression through matrix differential calculus | Code | 0 |
| On sparse regression, Lp-regularization, and automated model discovery | | 0 |
| Maintaining Plasticity in Continual Learning via Regenerative Regularization | | 0 |
| Less is More -- Towards parsimonious multi-task models using structured sparsity | Code | 0 |
| Dropout Regularization Versus $\ell_2$-Penalization in the Linear Model | | 0 |
| Electromyography Signal Classification Using Deep Learning | | 0 |
| Maximum margin learning of t-SPNs for cell classification with filtered input | | 0 |
| Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition | | 0 |
| Planting and Mitigating Memorized Content in Predictive-Text Language Models | Code | 0 |
