Data Augmentation

Data augmentation refers to techniques that enlarge a dataset by applying modifications to existing examples, producing new examples from the original ones. It increases both the size and the diversity of the dataset. When training machine learning models, data augmentation acts as a regularizer and helps reduce overfitting.

Data augmentation techniques have proven useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deleting, and randomly inserting words, among others.
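The transformations named above can be illustrated with a minimal, dependency-free Python sketch. All function names here are hypothetical and chosen for illustration. Images are represented as nested lists of pixel values and sentences as lists of tokens; a real pipeline would more likely use an array or tensor library. Rotation is restricted to 90-degree steps to keep the example short.

```python
import random

# --- Image-style augmentations (image = list of rows of pixel values) ---

def random_crop(image, size, rng):
    """Cut a random size x size window out of the image."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - size)    # randint is inclusive on both ends
    left = rng.randint(0, w - size)
    return [row[left:left + size] for row in image[top:top + size]]

def horizontal_flip(image):
    """Mirror the image left-to-right."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate the image 90 degrees counterclockwise."""
    return [list(col) for col in zip(*image)][::-1]

# --- Text-style augmentations (tokens = list of words) ---

def random_swap(tokens, rng):
    """Swap two randomly chosen token positions."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one."""
    kept = [t for t in tokens if rng.random() >= p]
    return kept or [rng.choice(list(tokens))]

def random_insertion(tokens, vocab, rng):
    """Insert one word drawn from vocab at a random position."""
    tokens = list(tokens)
    tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(vocab))
    return tokens

if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(horizontal_flip(image))  # [[3, 2, 1], [6, 5, 4], [9, 8, 7]]
    print(rotate90(image))         # [[3, 6, 9], [2, 5, 8], [1, 4, 7]]
    print(random_crop(image, 2, rng))

    sentence = "data augmentation increases dataset diversity".split()
    print(random_swap(sentence, rng))
    print(random_deletion(sentence, 0.3, rng))
    print(random_insertion(sentence, ["training", "model"], rng))
```

Each function returns a new object and leaves its input untouched, so several augmented variants can be generated from the same original example.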

Papers

Title	Date	Tasks	Status	Hype	Score
PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation	May 6, 2021	3D Human Pose Estimation, Data Augmentation	Code Available	1	5
Pose-disentangled Contrastive Learning for Self-supervised Facial Representation	Nov 24, 2022	Contrastive Learning, Data Augmentation	Code Available	1	5
NCAGC: A Neighborhood Contrast Framework for Attributed Graph Clustering	Jun 16, 2022	Clustering, Contrastive Learning	Code Available	1	5
Pretraining Representations for Bioacoustic Few-shot Detection using Supervised Contrastive Learning	Sep 2, 2023	Contrastive Learning, Data Augmentation	Code Available	1	5
Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation	Jan 21, 2022	Classification, Contrastive Learning	Code Available	1	5
PRIME: A few primitives can boost robustness to common corruptions	Dec 27, 2021	Computational Efficiency, Data Augmentation	Code Available	1	5
ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation	Dec 18, 2023	Data Augmentation, Deblurring	Code Available	1	5
AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models	Jun 24, 2024	Benchmarking, Data Augmentation	Code Available	1	5
AutoDC: Automated data-centric processing	Nov 23, 2021	AutoML, Data Augmentation	Code Available	1	5
EEG-Inception: An Accurate and Robust End-to-End Neural Network for EEG-based Motor Imagery Classification	Jan 24, 2021	Brain Computer Interface, Classification	Code Available	1	5
Acoustic echo cancellation with the dual-signal transformation LSTM network	Oct 27, 2020	Acoustic echo cancellation, Data Augmentation	Code Available	1	5
Easter2.0: Improving convolutional models for handwritten text recognition	May 30, 2022	Data Augmentation, Few-Shot Learning	Code Available	1	5
EC-GAN: Low-Sample Classification using Semi-Supervised Algorithms and GANs	Dec 26, 2020	Classification, Data Augmentation	Code Available	1	5
PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation	Oct 22, 2023	Data Augmentation, Language Modeling	Code Available	1	5
IDA: Improved Data Augmentation Applied to Salient Object Detection	Sep 18, 2020	Data Augmentation, Image Cropping	Code Available	1	5
EEG-GAN: Generative adversarial networks for electroencephalograhic (EEG) brain signals	Jun 5, 2018	Data Augmentation, EEG	Code Available	1	5
ECNU-SenseMaker at SemEval-2020 Task 4: Leveraging Heterogeneous Knowledge Resources for Commonsense Validation and Explanation	Jul 28, 2020	Data Augmentation, Graph Attention	Code Available	1	5
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels	Apr 28, 2020	All, Atari Games 100k	Code Available	1	5
Hyperspectral Image Super-Resolution with Spectral Mixup and Heterogeneous Datasets	Jan 19, 2021	Data Augmentation, Hyperspectral Image Super-Resolution	Code Available	1	5
Overcoming challenges in leveraging GANs for few-shot data augmentation	Mar 30, 2022	Classification, Data Augmentation	Code Available	1	5
A Probabilistic Framework for Knowledge Graph Data Augmentation	Oct 25, 2021	Data Augmentation, Knowledge Graph Completion	Code Available	1	5
HyperTab: Hypernetwork Approach for Deep Learning on Small Tabular Datasets	Apr 7, 2023	Data Augmentation, Deep Learning	Code Available	1	5
Capturing complex hand movements and object interactions using machine learning-powered stretchable smart textile gloves	Oct 3, 2024	Data Augmentation	Code Available	1	5
AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation	Jul 27, 2022	Data Augmentation, Deep Reinforcement Learning	Code Available	1	5
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective	Dec 8, 2023	Cross-Modal Retrieval, Data Augmentation	Code Available	1	5
EEG Synthetic Data Generation Using Probabilistic Diffusion Models	Mar 6, 2023	Brain Computer Interface, Data Augmentation	Code Available	1	5
AutoCLINT: The Winning Method in AutoCV Challenge 2019	May 9, 2020	BIG-bench Machine Learning, Data Augmentation	Code Available	1	5
AutoBalance: Optimized Loss Functions for Imbalanced Data	Jan 4, 2022	Data Augmentation, Fairness	Code Available	1	5
Effective Pre-Training of Audio Transformers for Sound Event Detection	Sep 14, 2024	Data Augmentation, Event Detection	Code Available	1	5
RAD: A Dataset and Benchmark for Real-Life Anomaly Detection with Robotic Observations	Oct 1, 2024	Anomaly Detection, Data Augmentation	Code Available	1	5
HybridAugment++: Unified Frequency Spectra Perturbations for Model Robustness	Jul 21, 2023	Adversarial Robustness, Data Augmentation	Code Available	1	5
CarveMix: A Simple Data Augmentation Method for Brain Lesion Segmentation	Aug 16, 2021	Data Augmentation, Lesion Segmentation	Code Available	1	5
HypMix: Hyperbolic Interpolative Data Augmentation	Nov 1, 2021	Adversarial Robustness, Data Augmentation	Code Available	1	5
Raindrops on Windshield: Dataset and Lightweight Gradient-Based Detection Algorithm	Apr 11, 2021	Autonomous Vehicles, Data Augmentation	Code Available	1	5
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation	Sep 15, 2021	Data Augmentation, Knowledge Distillation	Code Available	1	5
Cascaded deep monocular 3D human pose estimation with evolutionary training data	Jun 14, 2020	3D Human Pose Estimation, Data Augmentation	Code Available	1	5
EfficientDeRain: Learning Pixel-wise Dilation Filtering for High-Efficiency Single-Image Deraining	Sep 19, 2020	Data Augmentation, Rain Removal	Code Available	1	5
Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning	Sep 10, 2021	Continual Pretraining, Contrastive Learning	Code Available	1	5
ChimeraMix: Image Classification on Small Datasets via Masked Feature Mixing	Feb 23, 2022	Classification, Data Augmentation	Code Available	1	5
Efficiently Modeling Long Sequences with Structured State Spaces	Oct 31, 2021	Data Augmentation, Language Modeling	Code Available	1	5
AAPL: Adding Attributes to Prompt Learning for Vision-Language Models	Apr 25, 2024	Data Augmentation, Domain Generalization	Code Available	1	5
End-to-end lyrics Recognition with Voice to Singing Style Transfer	Feb 17, 2021	Data Augmentation, Language Modeling	Code Available	1	5
How to trust unlabeled data? Instance Credibility Inference for Few-Shot Learning	Jul 15, 2020	Data Augmentation, Few-Shot Learning	Code Available	1	5
Efficient Model for Image Classification With Regularization Tricks	Feb 1, 2020	Classification, Data Augmentation	Code Available	1	5
HRSAM: Efficient Interactive Segmentation in High-Resolution Images	Jul 2, 2024	Data Augmentation, GPU	Code Available	1	5
CCMNet: Leveraging Calibrated Color Correction Matrices for Cross-Camera Color Constancy	Apr 10, 2025	Color Constancy, Data Augmentation	Code Available	1	5
CellMix: A General Instance Relationship based Method for Data Augmentation Towards Pathology Image Classification	Jan 27, 2023	Data Augmentation, image-classification	Code Available	1	5
Causal Action Influence Aware Counterfactual Data Augmentation	May 29, 2024	counterfactual, Counterfactual Reasoning	Code Available	1	5
Entailment as Few-Shot Learner	Apr 29, 2021	Contrastive Learning, Data Augmentation	Code Available	1	5
CCGL: Contrastive Cascade Graph Learning	Jul 27, 2021	Data Augmentation, Graph Learning	Code Available	1	5

Datasets: ImageNet, CIFAR-10, GA1457

Benchmark Results

Dataset: ImageNet
#	Model	Metric	Claimed	Verified	Status
1	DeiT-B (+MixPro)	Accuracy (%)	82.9	—	Unverified
2	ResNet-200 (DeepAA)	Accuracy (%)	81.32	—	Unverified
3	DeiT-S (+MixPro)	Accuracy (%)	81.3	—	Unverified
4	ResNet-200 (Fast AA)	Accuracy (%)	80.6	—	Unverified
5	ResNet-200 (UA)	Accuracy (%)	80.4	—	Unverified
6	ResNet-200 (AA)	Accuracy (%)	80	—	Unverified
7	ResNet-50 (DeepAA)	Accuracy (%)	78.3	—	Unverified
8	ResNet-50 (TA wide)	Accuracy (%)	78.07	—	Unverified
9	ResNet-50 (LoRot-E)	Accuracy (%)	77.72	—	Unverified
10	ResNet-50 (LoRot-I)	Accuracy (%)	77.71	—	Unverified

Dataset: CIFAR-10
#	Model	Metric	Claimed	Verified	Status
1	WideResNet-40-2 (Faster AA)	Percentage error	3.7	—	Unverified
2	Shake-Shake (26 2×32d) (Faster AA)	Percentage error	2.7	—	Unverified
3	WideResNet-28-10 (Faster AA)	Percentage error	2.6	—	Unverified
4	Shake-Shake (26 2×112d) (Faster AA)	Percentage error	2	—	Unverified
5	Shake-Shake (26 2×96d) (Faster AA)	Percentage error	2	—	Unverified

Dataset: GA1457
#	Model	Metric	Claimed	Verified	Status
1	DiffAug	Classification Accuracy	92.7	—	Unverified
2	PaCMAP	Classification Accuracy	85.3	—	Unverified
3	hNNE	Classification Accuracy	77.4	—	Unverified
4	TopoAE	Classification Accuracy	74.6	—	Unverified