Data Augmentation

Data augmentation is a set of techniques that enlarge a dataset by applying modifications to its existing examples. Besides increasing the number of examples, it also increases the diversity of the dataset. When training machine learning models, data augmentation acts as a regularizer and helps reduce overfitting.

Data augmentation techniques have proven useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deletion, and random insertion, among others.
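The transformations named above can be illustrated with a minimal sketch using only the Python standard library. It is not taken from any of the listed papers: the function names, the nested-list image representation, the 90-degree rotation step, and the deletion probability `p` are all assumptions chosen for illustration, and real pipelines would normally rely on an image or text processing library.

```python
import random


def horizontal_flip(image):
    """Mirror an image (list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def random_crop(image, size, rng):
    """Cut a random size x size window out of the image."""
    top = rng.randint(0, len(image) - size)
    left = rng.randint(0, len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]]


def random_swap(tokens, rng):
    """Swap two randomly chosen token positions."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, rng, p=0.2):
    """Drop each token with probability p (p is an illustrative default)."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    # Keep at least one token so the example never becomes empty.
    return kept if kept else [rng.choice(tokens)]


def random_insertion(tokens, rng, vocabulary):
    """Insert one random vocabulary word at a random position."""
    tokens = list(tokens)
    tokens.insert(rng.randint(0, len(tokens)), rng.choice(vocabulary))
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(rotate_90(horizontal_flip(random_crop(image, 2, rng))))
    sentence = "data augmentation helps reduce overfitting".split()
    print(random_swap(sentence, rng))
    print(random_deletion(sentence, rng))
    print(random_insertion(sentence, rng, ["model", "training"]))
```

Each function returns a new object and leaves its input unchanged, so several augmented variants can be derived from the same original example.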

Papers



Title	Date	Tasks	Status
Low-resource neural machine translation with morphological modeling	Apr 3, 2024	Data Augmentation, Decoder	Code Available
Low Resource Text Classification with ULMFit and Backtranslation	Mar 21, 2019	Classification, Data Augmentation	Code Available
Research Trends and Applications of Data Augmentation Algorithms	Jul 18, 2022	Data Augmentation	Code Available
Use of What-if Scenarios to Help Explain Artificial Intelligence Models for Neonatal Health	Oct 12, 2024	counterfactual, Data Augmentation	Code Available
Turning Flowchart into Dialog: Augmenting Flowchart-grounded Troubleshooting Dialogs via Synthetic Data Generation	May 2, 2023	Data Augmentation, Response Generation	Code Available
Multi-task Pre-training Language Model for Semantic Network Completion	Jan 13, 2022	Contrastive Learning, Data Augmentation	Code Available
Turning Waste into Wealth: Leveraging Low-Quality Samples for Enhancing Continuous Conditional Generative Adversarial Networks	Aug 20, 2023	Data Augmentation	Code Available
Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks?	Oct 27, 2024	Data Augmentation, Math	Code Available
TwinCL: A Twin Graph Contrastive Learning Model for Collaborative Filtering	Sep 27, 2024	Collaborative Filtering, Contrastive Learning	Code Available
Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation for Classification	Oct 14, 2024	Data Augmentation, Few-Shot Learning	Code Available
Time Series Data Augmentation as an Imbalanced Learning Problem	Apr 29, 2024	Data Augmentation, Time Series	Code Available
Beyond Deterministic Translation for Unsupervised Domain Adaptation	Feb 15, 2022	Data Augmentation, Domain Adaptation	Code Available
LUMix: Improving Mixup by Better Modelling Label Uncertainty	Nov 29, 2022	Data Augmentation	Code Available
Lund jet images from generative and cycle-consistent adversarial networks	Sep 3, 2019	Data Augmentation	Code Available
A Comparison of Deep Learning Methods for Cell Detection in Digital Cytology	Apr 9, 2025	Cell Detection, Computational Efficiency	Code Available
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR	Dec 7, 2024	Automatic Speech Recognition, Data Augmentation	Code Available
GuidedMixup: An Efficient Mixup Strategy Guided by Saliency Maps	Jun 29, 2023	Data Augmentation	Code Available
Data, Depth, and Design: Learning Reliable Models for Skin Lesion Analysis	Nov 1, 2017	Data Augmentation, Transfer Learning	Code Available
Lung Swapping Autoencoder: Learning a Disentangled Structure-texture Representation of Chest Radiographs	Jan 18, 2022	Data Augmentation	Code Available
Data-Centric Strategies for Overcoming PET/CT Heterogeneity: Insights from the AutoPET III Lesion Segmentation Challenge	Sep 16, 2024	Data Augmentation, Lesion Segmentation	Code Available
SRE-Conv: Symmetric Rotation Equivariant Convolution for Biomedical Image Classification	Jan 16, 2025	Data Augmentation, image-classification	Code Available
m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks	Aug 23, 2020	Anatomy, Data Augmentation	Code Available
Data Augmented 3D Semantic Scene Completion with 2D Segmentation Priors	Nov 26, 2021	3D geometry, 3D Semantic Scene Completion	Code Available
Data Augmentation with Variational Autoencoder for Imbalanced Dataset	Dec 9, 2024	Data Augmentation, regression	Code Available
Appearance and Pose-Conditioned Human Image Generation using Deformable GANs	Apr 30, 2019	Data Augmentation, Generative Adversarial Network	Code Available
Aplicación de redes neuronales convolucionales profundas al diagnóstico asistido de la enfermedad de Alzheimer	Oct 15, 2022	Data Augmentation, Transfer Learning	Code Available
Aggression Identification Using Deep Learning and Data Augmentation	Aug 1, 2018	Aggression Identification, Data Augmentation	Code Available
ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding	Aug 29, 2024	Data Augmentation, Image Generation	Code Available
Machine learning approaches for automatic defect detection in photovoltaic systems	Sep 24, 2024	Data Augmentation, Deep Learning	Code Available
Use the Detection Transformer as a Data Augmenter	Apr 10, 2023	Data Augmentation, image-classification	Code Available
APAR: Modeling Irregular Target Functions in Tabular Regression via Arithmetic-Aware Pre-Training and Adaptive-Regularized Fine-Tuning	Dec 14, 2024	Data Augmentation, tabular-regression	Code Available
Data Augmentation with Atomic Templates for Spoken Language Understanding	Aug 28, 2019	Data Augmentation, Decoder	Code Available
Machine learning for rapid discovery of laminar flow channel wall modifications that enhance heat transfer	Jan 19, 2021	BIG-bench Machine Learning, Data Augmentation	Code Available
Better May Not Be Fairer: A Study on Subgroup Discrepancy in Image Classification	Dec 16, 2022	Data Augmentation, image-classification	Code Available
Machine Learning Models that Remember Too Much	Sep 22, 2017	BIG-bench Machine Learning, Data Augmentation	Code Available
QuestGen: Effectiveness of Question Generation Methods for Fact-Checking Applications	Jul 31, 2024	Data Augmentation, Fact Checking	Code Available
SSL-DG: Rethinking and Fusing Semi-supervised Learning and Domain Generalization in Medical Image Segmentation	Nov 5, 2023	Data Augmentation, Domain Generalization	Code Available
GSDFuse: Capturing Cognitive Inconsistencies from Multi-Dimensional Weak Signals in Social Media Steganalysis	May 20, 2025	Data Augmentation, Feature Engineering	Code Available
Data Augmentation View on Graph Convolutional Network and the Proposal of Monte Carlo Graph Learning	Jun 23, 2020	Data Augmentation, Graph Learning	Code Available
Rethinking Data Augmentation for Low-Resource Neural Machine Translation: A Multi-Task Learning Approach	Sep 8, 2021	Data Augmentation, Decoder	Code Available
Data Augmentation via Levy Processes	Mar 21, 2016	Data Augmentation, Image Augmentation	Code Available
Greedy AutoAugment	Aug 2, 2019	Data Augmentation	Code Available
Data Augmentation via Dependency Tree Morphing for Low-Resource Languages	Mar 22, 2019	Data Augmentation, Part-Of-Speech Tagging	Code Available
GraphVICRegHSIC: Towards improved self-supervised representation learning for graphs with a hyrbid loss function	May 25, 2021	Data Augmentation, Representation Learning	Code Available
GraphMAD: Graph Mixup for Data Augmentation using Data-Driven Convex Clustering	Oct 27, 2022	Clustering, Data Augmentation	Code Available
Graph Contrastive Learning for Connectome Classification	Feb 7, 2025	Classification, Contrastive Learning	Code Available
Data augmentation using synthetic data for time series classification with deep residual networks	Aug 7, 2018	Data Augmentation, Dynamic Time Warping	Code Available
Data Augmentation using Random Image Cropping and Patching for Deep CNNs	Nov 22, 2018	Data Augmentation, Image Augmentation	Code Available
Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation	Sep 20, 2022	Data Augmentation, Knowledge Distillation	Code Available
Better Language Models of Code through Self-Improvement	Apr 2, 2023	Code Summarization, Data Augmentation	Code Available



Datasets: ImageNet, CIFAR-10, GA1457

Benchmark Results

ImageNet

#	Model	Metric	Claimed	Verified	Status
1	DeiT-B (+MixPro)	Accuracy (%)	82.9	—	Unverified
2	ResNet-200 (DeepAA)	Accuracy (%)	81.32	—	Unverified
3	DeiT-S (+MixPro)	Accuracy (%)	81.3	—	Unverified
4	ResNet-200 (Fast AA)	Accuracy (%)	80.6	—	Unverified
5	ResNet-200 (UA)	Accuracy (%)	80.4	—	Unverified
6	ResNet-200 (AA)	Accuracy (%)	80	—	Unverified
7	ResNet-50 (DeepAA)	Accuracy (%)	78.3	—	Unverified
8	ResNet-50 (TA wide)	Accuracy (%)	78.07	—	Unverified
9	ResNet-50 (LoRot-E)	Accuracy (%)	77.72	—	Unverified
10	ResNet-50 (LoRot-I)	Accuracy (%)	77.71	—	Unverified

CIFAR-10

#	Model	Metric	Claimed	Verified	Status
1	WideResNet-40-2 (Faster AA)	Percentage error	3.7	—	Unverified
2	Shake-Shake (26 2×32d) (Faster AA)	Percentage error	2.7	—	Unverified
3	WideResNet-28-10 (Faster AA)	Percentage error	2.6	—	Unverified
4	Shake-Shake (26 2×112d) (Faster AA)	Percentage error	2	—	Unverified
5	Shake-Shake (26 2×96d) (Faster AA)	Percentage error	2	—	Unverified

GA1457

#	Model	Metric	Claimed	Verified	Status
1	DiffAug	Classification Accuracy	92.7	—	Unverified
2	PaCMAP	Classification Accuracy	85.3	—	Unverified
3	hNNE	Classification Accuracy	77.4	—	Unverified
4	TopoAE	Classification Accuracy	74.6	—	Unverified