Data Augmentation

Data augmentation covers techniques that enlarge a dataset by applying modifications to its existing examples, producing additional training examples. Besides growing the dataset, it also increases the dataset's diversity. When training machine learning models, data augmentation acts as a regularizer and helps reduce overfitting.

Data augmentation techniques have proven useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deletion, and random insertion, among others.
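The transformations named above can be illustrated with a minimal sketch in plain Python. It is not taken from any of the listed papers. It assumes an image is a 2D grid (a list of rows) and a sentence is a list of tokens. The function names, the `p` deletion probability, and the `vocabulary` argument are hypothetical choices made for illustration. Random insertion here draws a word from a supplied vocabulary, which is a simplification (published variants often insert synonyms instead).

```python
import random

# --- Image-style transforms on a 2D grid (list of rows) ---

def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def random_crop(image, height, width, rng):
    """Cut a height x width window at a random position inside the grid."""
    top = rng.randint(0, len(image) - height)
    left = rng.randint(0, len(image[0]) - width)
    return [row[left:left + width] for row in image[top:top + height]]

# --- Token-level transforms on a list of words ---

def random_swap(tokens, rng):
    """Exchange the tokens at two randomly chosen positions."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

def random_insertion(tokens, vocabulary, rng):
    """Insert one word from a (hypothetical) vocabulary at a random position."""
    tokens = list(tokens)
    tokens.insert(rng.randint(0, len(tokens)), rng.choice(vocabulary))
    return tokens

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(horizontal_flip(image))
    print(rotate90(image))
    print(random_crop(image, 2, 2, rng))

    sentence = "data augmentation helps reduce overfitting".split()
    print(random_swap(sentence, rng))
    print(random_deletion(sentence, 0.2, rng))
    print(random_insertion(sentence, ["often", "usually"], rng))
```

Each function returns a new object and leaves its input unchanged, so the original example and its augmented copies can both be added to the training set. In practice, a library such as torchvision or Albumentations would typically handle the image transforms; the pure-Python versions here only show the idea.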

Papers


Showing 5101–5150 of 8378 papers

Title	Date	Tasks	Status
Wavesplit: End-to-End Speech Separation by Speaker Clustering	Feb 20, 2020	Clustering, Data Augmentation	Unverified
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding	Oct 25, 2022	Data Augmentation, Dialogue Understanding	Unverified
Weakly Supervised Temporal Sentence Grounding With Uncertainty-Guided Self-Training	Jan 1, 2023	Data Augmentation, Sentence	Unverified
Weakly supervised training of deep convolutional neural networks for overhead pedestrian localization in depth fields	Jun 9, 2017	Data Augmentation, Object Localization	Unverified
Weather Classification: A new multi-class dataset, data augmentation approach and comprehensive evaluations of Convolutional Neural Networks	Aug 1, 2018	Data Augmentation, General Classification	Unverified
WeatherFormer: Empowering Global Numerical Weather Forecasting with Space-Time Transformer	Sep 21, 2024	Data Augmentation, Weather Forecasting	Unverified
WeldMon: A Cost-effective Ultrasonic Welding Machine Condition Monitoring System	Aug 5, 2023	Classification, Data Augmentation	Unverified
WeMix: How to Better Utilize Data Augmentation	Oct 3, 2020	Data Augmentation	Unverified
WERank: Towards Rank Degradation Prevention for Self-Supervised Learning Using Weight Regularization	Feb 14, 2024	Data Augmentation, Self-Supervised Learning	Unverified
WeSinger: Data-augmented Singing Voice Synthesis with Auxiliary Losses	Mar 21, 2022	Data Augmentation, Decoder	Unverified
What Affects Learned Equivariance in Deep Image Recognition Models?	Apr 5, 2023	Data Augmentation, Inductive Bias	Unverified
What are effective labels for augmented data? Improving robustness with AutoLabel	Jan 1, 2021	Adversarial Robustness, Data Augmentation	Unverified
What Are Effective Labels for Augmented Data? Improving Calibration and Robustness with AutoLabel	Feb 22, 2023	Data Augmentation	Unverified
What Do Adversarially trained Neural Networks Focus: A Fourier Domain-based Study	Mar 16, 2022	Autonomous Driving, Data Augmentation	Unverified
What do we learn from a large-scale study of pre-trained visual representations in sim and real environments?	Oct 3, 2023	Data Augmentation	Unverified
What Do You Need for Diverse Trajectory Stitching in Diffusion Planning?	May 23, 2025	Behavioural cloning, Data Augmentation	Unverified
What Happened to My Dog in That Network: Unraveling Top-down Generators in Convolutional Neural Networks	Nov 23, 2015	Data Augmentation, Zero-Shot Learning	Unverified
What is Holding Back Convnets for Detection?	Aug 12, 2015	Data Augmentation, object-detection	Unverified
What makes a good data augmentation for few-shot unsupervised image anomaly detection?	Apr 6, 2023	Anomaly Detection, Data Augmentation	Unverified
What Makes Better Augmentation Strategies? Augment Difficult but Not too Different	Sep 29, 2021	Data Augmentation, Semantic Similarity	Unverified
What Makes for Good Views for Contrastive Learning?	May 20, 2020	Contrastive Learning, Data Augmentation	Unverified
What Makes for Robust Multi-Modal Models in the Face of Missing Modalities?	Oct 10, 2023	Data Augmentation	Unverified
What Matters for Active Texture Recognition With Vision-Based Tactile Sensors	Mar 20, 2024	Data Augmentation	Unverified
What's All the FUSS About Free Universal Sound Separation Data?	Nov 2, 2020	All, Data Augmentation	Unverified
When and How Mixup Improves Calibration	Feb 11, 2021	Data Augmentation	Unverified
When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation	Nov 16, 2021	Data Augmentation, HellaSwag	Unverified
When Covariate-shifted Data Augmentation Increases Test Error And How to Fix It	Sep 25, 2019	Data Augmentation, regression	Unverified
Does Data Augmentation Improve Generalization in NLP?	Apr 30, 2020	Data Augmentation, Fairness	Unverified
When Does Re-initialization Work?	Jun 20, 2022	Data Augmentation, image-classification	Unverified
When is Multi-task Learning Beneficial for Low-Resource Noisy Code-switched User-generated Algerian Texts?	May 1, 2020	Data Augmentation, Multi-Task Learning	Unverified
WHERE and WHICH: Iterative Debate for Biomedical Synthetic Data Augmentation	Mar 31, 2025	counterfactual, Data Augmentation	Unverified
Where is the bottleneck in long-tailed classification?	Sep 29, 2021	Classification, Data Augmentation	Unverified
Where is the disease? Semi-supervised pseudo-normality synthesis from an abnormal image	Jun 24, 2021	Data Augmentation, Image Generation	Unverified
Where Should I Spend My FLOPS? Efficiency Evaluations of Visual Pre-training Methods	Sep 30, 2022	Computational Efficiency, Data Augmentation	Unverified
Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models	Jan 3, 2022	CPU, Data Augmentation	Unverified
Whisper Finetuning on Nepali Language	Nov 19, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Whisper Turns Stronger: Augmenting Wav2Vec 2.0 for Superior ASR in Low-Resource Languages	Dec 31, 2024	Automatic Speech Recognition, Data Augmentation	Unverified
White-box Testing of NLP models with Mask Neuron Coverage	May 10, 2022	Data Augmentation, Fault Detection	Unverified
White Light Specular Reflection Data Augmentation for Deep Learning Polyp Detection	May 8, 2025	Data Augmentation	Unverified
Who is we? Disambiguating the referents of first person plural pronouns in parliamentary debates	May 27, 2022	Data Augmentation	Unverified
Who Is Your Right Mixup Partner in Positive and Unlabeled Learning	Sep 29, 2021	Data Augmentation	Unverified
Whole-Slide Mitosis Detection in H&E Breast Histology Using PHH3 as a Reference to Train Distilled Stain-Invariant Convolutional Networks	Aug 17, 2018	Data Augmentation, Knowledge Distillation	Unverified
Why does music source separation benefit from cacophony?	Feb 28, 2024	Data Augmentation, Music Source Separation	Unverified
Why Pre-trained Models Fail: Feature Entanglement in Multi-modal Depression Detection	Mar 9, 2025	Data Augmentation, Depression Detection	Unverified
WideResNet with Joint Representation Learning and Data Augmentation for Cover Song Identification	Jul 18, 2022	Cover song identification, Data Augmentation	Unverified
Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts	Oct 16, 2023	counterfactual, Data Augmentation	Unverified
Winning Amazon KDD Cup'24	Aug 5, 2024	Data Augmentation, Multiple-choice	Unverified
Wireless Channel Aware Data Augmentation Methods for Deep Learning-Based Indoor Localization	Aug 12, 2024	Data Augmentation, Indoor Localization	Unverified
Without Further Ado: Direct and Simultaneous Speech Translation by AppTek in 2021	Aug 1, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
WMD at SemEval-2020 Tasks 7 and 11: Assessing Humor and Propaganda Using Unsupervised Data Augmentation	Dec 1, 2020	Data Augmentation	Unverified


Benchmark Results

Datasets: ImageNet, CIFAR-10, GA1457

ImageNet

#	Model	Metric	Claimed	Verified	Status
1	DeiT-B (+MixPro)	Accuracy (%)	82.9	—	Unverified
2	ResNet-200 (DeepAA)	Accuracy (%)	81.32	—	Unverified
3	DeiT-S (+MixPro)	Accuracy (%)	81.3	—	Unverified
4	ResNet-200 (Fast AA)	Accuracy (%)	80.6	—	Unverified
5	ResNet-200 (UA)	Accuracy (%)	80.4	—	Unverified
6	ResNet-200 (AA)	Accuracy (%)	80	—	Unverified
7	ResNet-50 (DeepAA)	Accuracy (%)	78.3	—	Unverified
8	ResNet-50 (TA wide)	Accuracy (%)	78.07	—	Unverified
9	ResNet-50 (LoRot-E)	Accuracy (%)	77.72	—	Unverified
10	ResNet-50 (LoRot-I)	Accuracy (%)	77.71	—	Unverified

CIFAR-10

#	Model	Metric	Claimed	Verified	Status
1	WideResNet-40-2 (Faster AA)	Percentage error	3.7	—	Unverified
2	Shake-Shake (26 2×32d) (Faster AA)	Percentage error	2.7	—	Unverified
3	WideResNet-28-10 (Faster AA)	Percentage error	2.6	—	Unverified
4	Shake-Shake (26 2×112d) (Faster AA)	Percentage error	2	—	Unverified
5	Shake-Shake (26 2×96d) (Faster AA)	Percentage error	2	—	Unverified

GA1457

#	Model	Metric	Claimed	Verified	Status
1	DiffAug	Classification Accuracy	92.7	—	Unverified
2	PaCMAP	Classification Accuracy	85.3	—	Unverified
3	hNNE	Classification Accuracy	77.4	—	Unverified
4	TopoAE	Classification Accuracy	74.6	—	Unverified