Data Augmentation

Data augmentation covers techniques that increase the amount of training data by applying modifications to the examples in the original dataset. Besides growing the dataset, it also increases the dataset's diversity. When training machine learning models, data augmentation acts as a regularizer and helps reduce overfitting.
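As a minimal sketch of how modifications expand a dataset, the snippet below adds one modified copy of every example per transform while keeping the label unchanged. The function name `augment_dataset`, the toy sentiment examples, and the choice of case changes as label-preserving transforms are illustrative assumptions, not part of any particular library.

```python
def augment_dataset(examples, transforms):
    """Return the original (x, label) pairs plus one transformed copy per transform.

    Assumes every transform preserves the label of the example it modifies.
    """
    augmented = list(examples)  # keep the originals first
    for x, label in examples:
        for transform in transforms:
            augmented.append((transform(x), label))
    return augmented


# Hypothetical toy data: two labeled sentences for sentiment classification.
examples = [("Great movie", "pos"), ("Boring plot", "neg")]

# Case changes are assumed not to alter the sentiment label.
augmented = augment_dataset(examples, [str.lower, str.upper])
print(len(examples), "->", len(augmented))
```

With N original examples and k transforms, the result holds N * (k + 1) examples. In practice, transforms are often applied randomly during training rather than materialized up front.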

Data augmentation techniques have proven useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deletion, and random insertion, among others.
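The transformations named above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not a reference implementation: images are represented as nested lists of pixel values, text as lists of tokens, and all function names, the `vocabulary` argument, and the deletion probability `p` are hypothetical choices made for the example.

```python
import random


def horizontal_flip(image):
    """Mirror an image (list of rows) left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate an image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def crop(image, top, left, height, width):
    """Cut out a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def random_swap(tokens, rng):
    """Swap two randomly chosen token positions."""
    out = list(tokens)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def random_insertion(tokens, vocabulary, rng):
    """Insert one token drawn from `vocabulary` at a random position."""
    out = list(tokens)
    out.insert(rng.randint(0, len(out)), rng.choice(vocabulary))
    return out


image = [[1, 2, 3],
         [4, 5, 6]]
print(horizontal_flip(image))      # [[3, 2, 1], [6, 5, 4]]
print(rotate_90_clockwise(image))  # [[4, 1], [5, 2], [6, 3]]
print(crop(image, 0, 1, 2, 2))     # [[2, 3], [5, 6]]

rng = random.Random(0)  # seeded so runs are repeatable
tokens = ["the", "cat", "sat", "down"]
print(random_swap(tokens, rng))
print(random_deletion(tokens, 0.25, rng))
print(random_insertion(tokens, ["quietly", "slowly"], rng))
```

Passing an explicit `random.Random` instance keeps the augmentations reproducible. Whether a given transformation preserves the label depends on the task; for example, flipping is unsafe when left and right carry meaning.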

Papers

Showing 1851–1900 of 8378 papers

Title	Date	Tasks	Status	Score
Learning the Difference that Makes a Difference with Counterfactually-Augmented Data	Sep 26, 2019	counterfactual, Data Augmentation	Code Available	5
Improving the Robustness of Question Answering Systems to Question Paraphrasing	Jul 1, 2019	Data Augmentation, Question Answering	Code Available	5
Audiogmenter: a MATLAB Toolbox for Audio Data Augmentation	Dec 11, 2019	Audio Classification, Data Augmentation	Code Available	5
Learning to Compose Domain-Specific Transformations for Data Augmentation	Sep 6, 2017	Data Augmentation, Image Augmentation	Code Available	5
Improving Systematic Generalization Through Modularity and Augmentation	Feb 22, 2022	Data Augmentation, Grounded language learning	Code Available	5
Improving Socratic Question Generation using Data Augmentation and Preference Optimization	Mar 1, 2024	Data Augmentation, Question Generation	Code Available	5
Learning to Recombine and Resample Data for Compositional Generalization	Oct 8, 2020	Data Augmentation, Instruction Following	Code Available	5
Improving singing voice separation with the Wave-U-Net using Minimum Hyperspherical Energy	Oct 22, 2019	Data Augmentation, image-classification	Code Available	5
A Lightweight Privacy-Preserving Scheme Using Label-based Pixel Block Mixing for Image Classification in Deep Learning	May 19, 2021	Data Augmentation, Deep Learning	Code Available	5
CROP: Towards Distributional-Shift Robust Reinforcement Learning using Compact Reshaped Observation Processing	Apr 26, 2023	Data Augmentation, Diversity	Code Available	5
Aggression Identification Using Deep Learning and Data Augmentation	Aug 1, 2018	Aggression Identification, Data Augmentation	Code Available	5
Learning Tree-Structured Composition of Data Augmentation	Aug 26, 2024	Contrastive Learning, Data Augmentation	Code Available	5
Improving Skeleton-based Action Recognition with Interactive Object Information	Jan 9, 2025	Action Recognition, Data Augmentation	Code Available	5
A Byte Sequence is Worth an Image: CNN for File Fragment Classification Using Bit Shift and n-Gram Embeddings	Apr 14, 2023	Data Augmentation	Code Available	5
Cross-dataset COVID-19 Transfer Learning with Cough Detection, Cough Segmentation, and Data Augmentation	Oct 12, 2022	Data Augmentation, Transfer Learning	Code Available	5
Scaling Up Single Image Dehazing Algorithm by Cross-Data Vision Alignment for Richer Representation Learning and Beyond	Jul 20, 2024	Data Augmentation, Image Dehazing	Code Available	5
A Survey of Data Synthesis Approaches	Jul 4, 2024	Data Augmentation, Diversity	Code Available	5
Improving satellite imagery segmentation using multiple Sentinel-2 revisits	Sep 25, 2024	Data Augmentation, Density Estimation	Code Available	5
Cross-Domain Face Synthesis using a Controllable GAN	Oct 31, 2019	Data Augmentation, Face Generation	Code Available	5
Constructing Multiple Tasks for Augmentation: Improving Neural Image Classification With K-means Features	Nov 18, 2019	Clustering, Data Augmentation	Code Available	5
Improving SSVEP BCI Spellers With Data Augmentation and Language Models	Dec 28, 2024	Brain Computer Interface, Data Augmentation	Code Available	5
IMSurReal Too: IMS in the Surface Realization Shared Task 2020	Dec 1, 2020	Data Augmentation	Code Available	5
Constructing Contrastive samples via Summarization for Text Classification with limited annotations	Apr 11, 2021	Contrastive Learning, Data Augmentation	Code Available	5
Improving Reading Comprehension Question Generation with Data Augmentation and Overgenerate-and-rank	Jun 15, 2023	Data Augmentation, Question Generation	Code Available	5
Leveraging Content and Context Cues for Low-Light Image Enhancement	Dec 10, 2024	Data Augmentation, Face Detection	Code Available	5
Leveraging Data Augmentation for Process Information Extraction	Apr 11, 2024	Data Augmentation, Relation Extraction	Code Available	5
Consistency Training by Synthetic Question Generation for Conversational Question Answering	Apr 17, 2024	Conversational Question Answering, Data Augmentation	Code Available	5
Improving Novelty Detection using the Reconstructions of Nearest Neighbours	Nov 11, 2021	Anomaly Detection, Data Augmentation	Code Available	5
Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures	Oct 23, 2020	Data Augmentation, Sentence	Code Available	5
Consistency of augmentation graph and network approximability in contrastive learning	Feb 6, 2025	Contrastive Learning, Data Augmentation	Code Available	5
Improving Neural Machine Translation Robustness via Data Augmentation: Beyond Back-Translation	Nov 1, 2019	Data Augmentation, Diversity	Code Available	5
Cross-lingual Back-Parsing: Utterance Synthesis from Meaning Representation for Zero-Resource Semantic Parsing	Oct 1, 2024	Cross-Lingual Transfer, Data Augmentation	Code Available	5
Improving LSTM-CTC based ASR performance in domains with limited training data	Jul 3, 2017	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	5
Combining Data Generation and Active Learning for Low-Resource Question Answering	Nov 27, 2022	Active Learning, Answer Generation	Code Available	5
Improving Neural Networks for Time Series Forecasting using Data Augmentation and AutoML	Mar 2, 2021	AutoML, BIG-bench Machine Learning	Code Available	5
Cross-Lingual Text Classification of Transliterated Hindi and Malayalam	Aug 31, 2021	Benchmarking, Classification	Code Available	5
Improving robustness to corruptions with multiplicative weight perturbations	Jun 24, 2024	Data Augmentation, image-classification	Code Available	5
Improving Generalization for Multimodal Fake News Detection	May 29, 2023	Data Augmentation, Fake News Detection	Code Available	5
Conjugate Bayesian Two-step Change Point Detection for Hawkes Process	Sep 26, 2024	Change Point Detection, Computational Efficiency	Code Available	5
A Geometry-Sensitive Approach for Photographic Style Classification	Sep 3, 2019	Classification, Data Augmentation	Code Available	5
Augmentation Backdoors	Sep 29, 2022	Data Augmentation	Code Available	5
Improving Grammatical Error Correction via Contextual Data Augmentation	Jun 25, 2024	Data Augmentation, Grammatical Error Correction	Code Available	5
A little goes a long way: Improving toxic language classification despite data scarcity	Sep 25, 2020	Data Augmentation, General Classification	Code Available	5
Cross-modal tumor segmentation using generative blending augmentation and self training	Apr 4, 2023	Data Augmentation, Image Generation	Code Available	5
15,500 Seconds: Lean UAV Classification Leveraging PEFT and Pre-Trained Networks	May 21, 2025	Audio Classification, Data Augmentation	Code Available	5
Improving In-Context Learning with Reasoning Distillation	Apr 14, 2025	ARC, Data Augmentation	Code Available	5
Improving Robustness via Tilted Exponential Layer: A Communication-Theoretic Perspective	Nov 2, 2023	Data Augmentation	Code Available	5
Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation	Oct 25, 2023	Conversational Recommendation, Data Augmentation	Code Available	5
Improving Robustness by Enhancing Weak Subnets	Jan 30, 2022	Adversarial Robustness, Data Augmentation	Code Available	5
A Generative Model of Symmetry Transformations	Mar 4, 2024	Data Augmentation, model	Code Available	5

Benchmark Results

Datasets, in the order of the tables below: ImageNet, CIFAR-10, GA1457

#	Model	Metric	Claimed	Verified	Status
1	DeiT-B (+MixPro)	Accuracy (%)	82.9	—	Unverified
2	ResNet-200 (DeepAA)	Accuracy (%)	81.32	—	Unverified
3	DeiT-S (+MixPro)	Accuracy (%)	81.3	—	Unverified
4	ResNet-200 (Fast AA)	Accuracy (%)	80.6	—	Unverified
5	ResNet-200 (UA)	Accuracy (%)	80.4	—	Unverified
6	ResNet-200 (AA)	Accuracy (%)	80	—	Unverified
7	ResNet-50 (DeepAA)	Accuracy (%)	78.3	—	Unverified
8	ResNet-50 (TA wide)	Accuracy (%)	78.07	—	Unverified
9	ResNet-50 (LoRot-E)	Accuracy (%)	77.72	—	Unverified
10	ResNet-50 (LoRot-I)	Accuracy (%)	77.71	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WideResNet-40-2 (Faster AA)	Percentage error	3.7	—	Unverified
2	Shake-Shake (26 2×32d) (Faster AA)	Percentage error	2.7	—	Unverified
3	WideResNet-28-10 (Faster AA)	Percentage error	2.6	—	Unverified
4	Shake-Shake (26 2×112d) (Faster AA)	Percentage error	2	—	Unverified
5	Shake-Shake (26 2×96d) (Faster AA)	Percentage error	2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DiffAug	Classification Accuracy	92.7	—	Unverified
2	PaCMAP	Classification Accuracy	85.3	—	Unverified
3	hNNE	Classification Accuracy	77.4	—	Unverified
4	TopoAE	Classification Accuracy	74.6	—	Unverified