SOTAVerified

Data Augmentation

Data augmentation is a set of techniques that increases the number of examples in a dataset by applying modifications to the original data. It not only grows the dataset but also increases its diversity; when training machine learning models, data augmentation acts as a regularizer and helps to avoid overfitting.

Data augmentation techniques have proven useful in domains such as computer vision and NLP. In computer vision, typical transformations include cropping, flipping, and rotation. In NLP, common techniques include word swapping, random deletion, and random insertion, among others.
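The transformations above can be sketched in a few lines of plain Python/NumPy. This is a minimal illustration, not any particular library's API; the function names, probabilities, and parameters are illustrative choices.

```python
import random

import numpy as np


# --- Vision-style augmentations on a toy image (H x W array) ---

def random_crop(img: np.ndarray, size: int) -> np.ndarray:
    """Crop a random size x size window from the image."""
    h, w = img.shape[:2]
    top = random.randint(0, h - size)
    left = random.randint(0, w - size)
    return img[top:top + size, left:left + size]


def augment_image(img: np.ndarray) -> np.ndarray:
    """Randomly flip horizontally, then rotate by a multiple of 90 degrees."""
    if random.random() < 0.5:
        img = np.fliplr(img)
    k = random.randint(0, 3)  # number of 90-degree rotations
    return np.rot90(img, k)


# --- NLP-style augmentations on a token list ---

def random_swap(tokens: list, n: int = 1) -> list:
    """Swap two randomly chosen tokens, n times."""
    tokens = tokens.copy()
    for _ in range(n):
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens: list, p: float = 0.1) -> list:
    """Drop each token with probability p, but never return an empty sentence."""
    kept = [t for t in tokens if random.random() > p]
    return kept or [random.choice(tokens)]


def random_insertion(tokens: list, vocab: list, n: int = 1) -> list:
    """Insert n random vocabulary words at random positions."""
    tokens = tokens.copy()
    for _ in range(n):
        tokens.insert(random.randrange(len(tokens) + 1), random.choice(vocab))
    return tokens


if __name__ == "__main__":
    img = np.arange(64).reshape(8, 8)
    print(augment_image(random_crop(img, 4)).shape)  # (4, 4)

    sent = "data augmentation expands the training set".split()
    print(random_swap(sent))
```

Each augmented copy keeps the original label, so applying several of these functions per example multiplies the effective size of the training set at negligible cost.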

(Image credit: Albumentations)

Papers

Showing 451–475 of 8378 papers

Title | Status | Hype
TimeDistill: Efficient Long-Term Time Series Forecasting with MLP via Cross-Architecture Distillation | – | 0
External Large Foundation Model: How to Efficiently Serve Trillions of Parameters for Online Ads Recommendation | – | 0
CrossFuse: Learning Infrared and Visible Image Fusion by Cross-Sensor Top-K Vision Alignment and Beyond | – | 0
Reducing false positives in strong lens detection through effective augmentation and ensemble learning | – | 0
Humanoid-VLA: Towards Universal Humanoid Control with Visual Integration | – | 0
Image compositing is all you need for data augmentation | – | 0
AS-GCL: Asymmetric Spectral Augmentation on Graph Contrastive Learning | – | 0
Democratizing Large Language Model-Based Graph Data Augmentation via Latent Knowledge Graphs | Code | 0
Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models | – | 0
MVCNet: Multi-View Contrastive Network for Motor Imagery Classification | Code | 1
Fake It Till You Make It: Using Synthetic Data and Domain Knowledge for Improved Text-Based Learning for LGE Detection | – | 0
Generative AI Enabled Robust Data Augmentation for Wireless Sensing in ISAC Networks | – | 0
Myna: Masking-Based Contrastive Learning of Musical Representations | Code | 1
Diversity-Oriented Data Augmentation with Large Language Models | – | 0
Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu | Code | 1
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity | – | 0
SpeechT: Findings of the First Mentorship in Speech Translation | – | 0
CARMA: Enhanced Compositionality in LLMs via Advanced Regularisation and Mutual Information Alignment | – | 0
ReLearn: Unlearning via Learning for Large Language Models | Code | 1
AudioSpa: Spatializing Sound Events with Text | – | 0
Generating Skyline Datasets for Data Science Models | – | 0
A Mathematics Framework of Artificial Shifted Population Risk and Its Further Understanding Related to Consistency Regularization | Code | 0
NeuroAMP: A Novel End-to-end General Purpose Deep Neural Amplifier for Personalized Hearing Aids | – | 0
Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion | – | 0
Causal Information Prioritization for Efficient Reinforcement Learning | – | 0
Page 19 of 336

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | DeiT-B (+MixPro) | Accuracy (%) | 82.9 | – | Unverified
2 | ResNet-200 (DeepAA) | Accuracy (%) | 81.32 | – | Unverified
3 | DeiT-S (+MixPro) | Accuracy (%) | 81.3 | – | Unverified
4 | ResNet-200 (Fast AA) | Accuracy (%) | 80.6 | – | Unverified
5 | ResNet-200 (UA) | Accuracy (%) | 80.4 | – | Unverified
6 | ResNet-200 (AA) | Accuracy (%) | 80 | – | Unverified
7 | ResNet-50 (DeepAA) | Accuracy (%) | 78.3 | – | Unverified
8 | ResNet-50 (TA wide) | Accuracy (%) | 78.07 | – | Unverified
9 | ResNet-50 (LoRot-E) | Accuracy (%) | 77.72 | – | Unverified
10 | ResNet-50 (LoRot-I) | Accuracy (%) | 77.71 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | WideResNet-40-2 (Faster AA) | Percentage error | 3.7 | – | Unverified
2 | Shake-Shake (26 2×32d) (Faster AA) | Percentage error | 2.7 | – | Unverified
3 | WideResNet-28-10 (Faster AA) | Percentage error | 2.6 | – | Unverified
4 | Shake-Shake (26 2×112d) (Faster AA) | Percentage error | 2 | – | Unverified
5 | Shake-Shake (26 2×96d) (Faster AA) | Percentage error | 2 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | DiffAug | Classification Accuracy | 92.7 | – | Unverified
2 | PaCMAP | Classification Accuracy | 85.3 | – | Unverified
3 | hNNE | Classification Accuracy | 77.4 | – | Unverified
4 | TopoAE | Classification Accuracy | 74.6 | – | Unverified