Data Augmentation

Data augmentation refers to techniques that enlarge a dataset by applying modifications to existing examples, producing new examples from the originals. Besides growing the dataset, it also increases the dataset's diversity. When training machine learning models, data augmentation acts as a regularizer and helps reduce overfitting.

Data augmentation techniques have proven useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deletion, and random insertion, among others.
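
The transformations named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: images are assumed to be plain nested lists of pixel values, text is assumed to be a list of already tokenized words, and every function name and parameter (horizontal_flip, random_crop, random_swap, the vocabulary argument, and so on) is hypothetical and chosen only for this example.

```python
import random


# --- Computer vision: an image is assumed to be a list of rows of pixels ---

def horizontal_flip(image):
    """Mirror every row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def random_crop(image, height, width, rng):
    """Cut a height x width window at a random position."""
    top = rng.randint(0, len(image) - height)
    left = rng.randint(0, len(image[0]) - width)
    return [row[left:left + width] for row in image[top:top + height]]


# --- NLP: a sentence is assumed to be a list of tokens ---

def random_swap(tokens, rng):
    """Exchange the positions of two randomly chosen tokens."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept or [rng.choice(tokens)]


def random_insertion(tokens, vocabulary, rng):
    """Insert one word drawn from a (hypothetical) vocabulary at a random position."""
    tokens = list(tokens)
    tokens.insert(rng.randint(0, len(tokens)), rng.choice(vocabulary))
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same augmentations
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    sentence = "data augmentation helps reduce overfitting".split()
    print(horizontal_flip(image))
    print(rotate_90_clockwise(image))
    print(random_crop(image, 2, 2, rng))
    print(random_swap(sentence, rng))
    print(random_deletion(sentence, 0.2, rng))
    print(random_insertion(sentence, ["model", "training"], rng))
```

Each function returns a new object and leaves its input untouched, so the original example can be kept in the dataset alongside its augmented variants. In practice the random operations would typically be applied on the fly during training, so that the model sees a different variant in each epoch.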

Papers

Title	Date	Tasks	Status
The ADAPT Centre’s Neural MT Systems for the WAT 2020 Document-Level Translation Task	Dec 1, 2020	Data Augmentation, Machine Translation	Unverified
The AI Mechanic: Acoustic Vehicle Characterization Neural Networks	May 19, 2022	Data Augmentation, Fault Detection	Unverified
The ASRU 2019 Mandarin-English Code-Switching Speech Recognition Challenge: Open Datasets, Tracks, Methods and Results	Jul 12, 2020	Data Augmentation, Language Identification	Unverified
The Benefits of Mixup for Feature Learning	Mar 15, 2023	Data Augmentation	Unverified
The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge	Feb 4, 2022	Action Detection, Activity Detection	Unverified
The Curious Case of Benign Memorization	Oct 25, 2022	Data Augmentation, Memorization	Unverified
The data augmentation algorithm	Jun 15, 2024	Data Augmentation	Unverified
The DKU Replay Detection System for the ASVspoof 2019 Challenge: On Data Augmentation, Feature Representation, Classification, and Fusion	Jul 5, 2019	Data Augmentation, General Classification	Unverified
The Dual-use Dilemma in LLMs: Do Empowering Ethical Capacities Make a Degraded Utility?	Jan 20, 2025	Data Augmentation, Question Answering	Unverified
The Effectiveness of Data Augmentation for Detection of Gastrointestinal Diseases from Endoscopical Images	Dec 11, 2017	Data Augmentation, General Classification	Unverified
The Effect of Data Augmentation on Classification of Atrial Fibrillation in Short Single-Lead ECG Signals Using Deep Neural Networks	Feb 7, 2020	Classification, Data Augmentation	Unverified
The effects of gender bias in word embeddings on depression prediction	Dec 15, 2022	Data Augmentation, Word Embeddings	Unverified
The Effects of Hallucinations in Synthetic Training Data for Relation Extraction	Oct 10, 2024	Data Augmentation, Knowledge Graphs	Unverified
The Effects of Regularization and Data Augmentation are Class Dependent	Apr 7, 2022	Data Augmentation	Unverified
The Efficacy of Semantics-Preserving Transformations in Self-Supervised Learning for Medical Ultrasound	Apr 10, 2025	Classification, Data Augmentation	Unverified
The FBK Participation in the WMT 2016 Automatic Post-editing Shared Task	Aug 1, 2016	Automatic Post-Editing, Data Augmentation	Unverified
The FruitShell French synthesis system at the Blizzard 2023 Challenge	Sep 1, 2023	Data Augmentation, Speech Synthesis	Unverified
The Geometry of Self-supervised Learning Models and its Impact on Transfer Learning	Sep 18, 2022	Data Augmentation, Self-Supervised Learning	Unverified
The Hidden Influence of Latent Feature Magnitude When Learning with Imbalanced Data	Jul 14, 2024	Data Augmentation, Prediction	Unverified
The identification of garbage dumps in the rural areas of Cyprus through the application of deep learning to satellite imagery	Jul 23, 2023	Data Augmentation	Unverified
The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them)	May 1, 2025	Data Augmentation	Unverified
The Imaginative Generative Adversarial Network: Automatic Data Augmentation for Dynamic Skeleton-Based Hand Gesture and Human Action Recognition	May 27, 2021	Action Recognition, Data Augmentation	Unverified
The Impact of Code-switched Synthetic Data Quality is Task Dependent: Insights from MT and ASR	Mar 30, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
The Impact of Frequency Bands on Acoustic Anomaly Detection of Machines using Deep Learning Based Model	Mar 1, 2024	Anomaly Detection, Data Augmentation	Unverified
The Impact of Preprocessing on Deep Representations for Iris Recognition on Unconstrained Environments	Aug 29, 2018	Data Augmentation, Iris Recognition	Unverified
The Importance of Importance Sampling for Deep Budgeted Training	Jan 1, 2021	Data Augmentation	Unverified
The Influences of Color and Shape Features in Visual Contrastive Learning	Jan 29, 2023	Contrastive Learning, Data Augmentation	Unverified
The JHU-Microsoft Submission for WMT21 Quality Estimation Shared Task	Sep 17, 2021	Data Augmentation, Task 2	Unverified
The LMU Munich System for the WMT 2021 Large-Scale Multilingual Machine Translation Shared Task	Nov 1, 2021	Data Augmentation, Knowledge Distillation	Unverified
The LMU System for the CoNLL-SIGMORPHON 2017 Shared Task on Universal Morphological Reinflection	Aug 1, 2017	Data Augmentation, Domain Adaptation	Unverified
The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing	Apr 7, 2017	Data Augmentation, POS	Unverified
The MLLP-UPV German-English Machine Translation System for WMT18	Oct 1, 2018	Data Augmentation, Machine Translation	Unverified
The NIST CTS Speaker Recognition Challenge	Apr 21, 2022	Data Augmentation, Speaker Recognition	Unverified
The Notary in the Haystack -- Countering Class Imbalance in Document Processing with CNNs	Jul 15, 2020	Binary Classification, Classification	Unverified
The NTNU System at the Interspeech 2020 Non-Native Children's Speech ASR Challenge	May 18, 2020	Data Augmentation, Diversity	Unverified
The NTNU Taiwanese ASR System for Formosa Speech Recognition Challenge 2020	Apr 9, 2021	Data Augmentation, Speech Enhancement	Unverified
The NUS-HLT System for ICASSP2024 ICMC-ASR Grand Challenge	Dec 26, 2023	Automatic Speech Recognition, Data Augmentation	Unverified
The Only Chance to Understand: Machine Translation of the Severely Endangered Low-resource Languages of Eurasia	Oct 1, 2022	Data Augmentation, Language Modeling	Unverified
Theoretical Analysis of Consistency Regularization with Limited Augmented Data	Sep 29, 2021	Data Augmentation, Generalization Bounds	Unverified
Theoretical and Empirical Study of Adversarial Examples	Sep 27, 2018	Data Augmentation	Unverified
Theoretical Guarantees of Data Augmented Last Layer Retraining Methods	May 9, 2024	Data Augmentation	Unverified
The Outcome of the 2022 Landslide4Sense Competition: Advanced Landslide Detection from Multi-Source Satellite Imagery	Sep 6, 2022	Data Augmentation, Semantic Segmentation	Unverified
The Penalty Imposed by Ablated Data Augmentation	Jun 8, 2020	Data Augmentation	Unverified
The Perception of Phase Intercept Distortion and its Application in Data Augmentation	Jun 17, 2025	Data Augmentation	Unverified
The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge	May 2, 2023	Data Augmentation, Domain Adaptation	Unverified
The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement	Nov 14, 2022	Data Augmentation, Speech Enhancement	Unverified
The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation	May 24, 2025	Data Augmentation	Unverified
Learning ABCs: Approximate Bijective Correspondence for isolating factors of variation with weak supervision	Mar 4, 2021	Data Augmentation, Pose Transfer	Unverified
Thermal-Infrared Remote Target Detection System for Maritime Rescue based on Data Augmentation with 3D Synthetic Data	Oct 31, 2023	Data Augmentation, Domain Adaptation	Unverified
The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition	May 14, 2024	Autonomous Driving, Data Augmentation	Unverified

Datasets: ImageNet, CIFAR-10, GA1457

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	DeiT-B (+MixPro)	Accuracy (%)	82.9	—	Unverified
2	ResNet-200 (DeepAA)	Accuracy (%)	81.32	—	Unverified
3	DeiT-S (+MixPro)	Accuracy (%)	81.3	—	Unverified
4	ResNet-200 (Fast AA)	Accuracy (%)	80.6	—	Unverified
5	ResNet-200 (UA)	Accuracy (%)	80.4	—	Unverified
6	ResNet-200 (AA)	Accuracy (%)	80	—	Unverified
7	ResNet-50 (DeepAA)	Accuracy (%)	78.3	—	Unverified
8	ResNet-50 (TA wide)	Accuracy (%)	78.07	—	Unverified
9	ResNet-50 (LoRot-E)	Accuracy (%)	77.72	—	Unverified
10	ResNet-50 (LoRot-I)	Accuracy (%)	77.71	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WideResNet-40-2 (Faster AA)	Percentage error	3.7	—	Unverified
2	Shake-Shake (26 2×32d) (Faster AA)	Percentage error	2.7	—	Unverified
3	WideResNet-28-10 (Faster AA)	Percentage error	2.6	—	Unverified
4	Shake-Shake (26 2×112d) (Faster AA)	Percentage error	2	—	Unverified
5	Shake-Shake (26 2×96d) (Faster AA)	Percentage error	2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DiffAug	Classification Accuracy	92.7	—	Unverified
2	PaCMAP	Classification Accuracy	85.3	—	Unverified
3	hNNE	Classification Accuracy	77.4	—	Unverified
4	TopoAE	Classification Accuracy	74.6	—	Unverified