Data Augmentation

Data augmentation is a set of techniques that enlarge a dataset by applying modifications to existing examples, producing new training examples from the originals. Besides growing the dataset, it also increases the dataset's diversity. When training machine learning models, data augmentation acts as a regularizer and helps reduce overfitting.

Data augmentation techniques have proven useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include token swapping, deletion, and random insertion, among others.
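The transformations named above can be illustrated with a minimal Python sketch that uses only the standard library. It is not taken from any particular paper or library: the function names, the nested-list image representation, and the `vocabulary` argument are all assumptions made for illustration, and the random insertion here draws from a caller-supplied word list rather than from synonyms.

```python
import random


def horizontal_flip(image):
    """Mirror a 2D image (a list of rows) left to right."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate a 2D image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def crop(image, top, left, height, width):
    """Return the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def random_swap(tokens, rng):
    """Swap the tokens at two distinct, randomly chosen positions."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def random_insertion(tokens, vocabulary, rng):
    """Insert one word from a (hypothetical) vocabulary at a random position."""
    tokens = list(tokens)
    tokens.insert(rng.randint(0, len(tokens)), rng.choice(vocabulary))
    return tokens


if __name__ == "__main__":
    image = [[1, 2], [3, 4]]
    print(horizontal_flip(image))  # [[2, 1], [4, 3]]
    print(rotate_90(image))        # [[3, 1], [4, 2]]

    rng = random.Random(0)  # seeded so repeated runs give the same variants
    sentence = "data augmentation helps reduce overfitting".split()
    print(random_swap(sentence, rng))
    print(random_deletion(sentence, 0.2, rng))
    print(random_insertion(sentence, ["often", "really"], rng))
```

In practice each original example would be passed through one or more of these functions during training, so the model sees a slightly different variant each time while the label stays the same.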

Papers

Showing 3601–3650 of 8378 papers

Title	Date	Tasks	Status
Few-shot Learning using Data Augmentation and Time-Frequency Transformation for Time Series Classification	Nov 6, 2023	Data Augmentation, Few-Shot Learning	Unverified
Augment the Pairs: Semantics-Preserving Image-Caption Pair Augmentation for Grounding-Based Vision and Language Models	Nov 5, 2023	Data Augmentation, Phrase Grounding	Code Available
SSL-DG: Rethinking and Fusing Semi-supervised Learning and Domain Generalization in Medical Image Segmentation	Nov 5, 2023	Data Augmentation, Domain Generalization	Code Available
TreeSwap: Data Augmentation for Machine Translation via Dependency Subtree Swapping	Nov 4, 2023	Data Augmentation, Machine Translation	Code Available
Noise-Agnostic Quantum Error Mitigation with Data Augmented Neural Models	Nov 3, 2023	Data Augmentation	Code Available
Comparative Knowledge Distillation	Nov 3, 2023	Data Augmentation, Knowledge Distillation	Code Available
Vicinal Risk Minimization for Few-Shot Cross-lingual Transfer in Abusive Language Detection	Nov 3, 2023	Abusive Language, Cross-Lingual Transfer	Unverified
Tailoring Mixup to Data for Calibration	Nov 2, 2023	Data Augmentation, Diversity	Code Available
Improving Robustness via Tilted Exponential Layer: A Communication-Theoretic Perspective	Nov 2, 2023	Data Augmentation	Code Available
Deep Double Descent for Time Series Forecasting: Avoiding Undertrained Models	Nov 2, 2023	Data Augmentation, Time Series	Unverified
People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection	Nov 2, 2023	Data Augmentation	Code Available
Data Augmentation for Code Translation with Comparable Corpora and Multiple References	Nov 1, 2023	Code Generation, Code Translation	Code Available
Rethinking Samples Selection for Contrastive Learning: Mining of Potential Samples	Nov 1, 2023	Contrastive Learning, Data Augmentation	Unverified
C2C: Cough to COVID-19 Detection in BHI 2023 Data Challenge	Nov 1, 2023	COVID-19 Diagnosis, Data Augmentation	Code Available
Bayes-enhanced Multi-view Attention Networks for Robust POI Recommendation	Nov 1, 2023	Data Augmentation, Representation Learning	Unverified
Is Robustness Transferable across Languages in Multilingual Neural Machine Translation?	Oct 31, 2023	Data Augmentation, Machine Translation	Unverified
Dynamic Batch Norm Statistics Update for Natural Robustness	Oct 31, 2023	Data Augmentation	Unverified
Histopathological Image Analysis with Style-Augmented Feature Domain Mixing for Improved Generalization	Oct 31, 2023	Data Augmentation, Domain Generalization	Code Available
Thermal-Infrared Remote Target Detection System for Maritime Rescue based on Data Augmentation with 3D Synthetic Data	Oct 31, 2023	Data Augmentation, Domain Adaptation	Unverified
Addressing Limitations of State-Aware Imitation Learning for Autonomous Driving	Oct 31, 2023	Autonomous Driving, Data Augmentation	Unverified
A Lightweight Method to Generate Unanswerable Questions in English	Oct 30, 2023	Data Augmentation, Question Answering	Code Available
A Note on Generalization in Variational Autoencoders: How Effective Is Synthetic Data & Overparameterization?	Oct 30, 2023	Data Augmentation, Deep Learning	Unverified
TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise	Oct 29, 2023	Data Augmentation, Language Modeling	Unverified
On Linear Separation Capacity of Self-Supervised Representation Learning	Oct 29, 2023	Data Augmentation, Representation Learning	Unverified
Exploring Data Augmentations on Self-/Semi-/Fully- Supervised Pre-trained Models	Oct 28, 2023	Data Augmentation, Diversity	Unverified
ODM3D: Alleviating Foreground Sparsity for Semi-Supervised Monocular 3D Object Detection	Oct 28, 2023	3D Object Detection, Autonomous Driving	Code Available
OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning	Oct 28, 2023	Data Augmentation, Out-of-Distribution Generalization	Unverified
Large-scale Foundation Models and Generative AI for BigData Neuroscience	Oct 27, 2023	Data Augmentation, Natural Language Understanding	Unverified
Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning	Oct 27, 2023	Autonomous Driving, D4RL	Unverified
MixRep: Hidden Representation Mixup for Low-Resource Speech Recognition	Oct 27, 2023	Data Augmentation, speech-recognition	Code Available
Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model	Oct 26, 2023	Data Augmentation, General Knowledge	Code Available
Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models	Oct 26, 2023	Autonomous Driving, Data Augmentation	Unverified
PAC-tuning: Fine-tuning Pretrained Language Models with PAC-driven Perturbed Gradient Descent	Oct 26, 2023	Data Augmentation, Few-Shot Learning	Unverified
Dialect Adaptation and Data Augmentation for Low-Resource ASR: TalTech Systems for the MADASR 2023 Challenge	Oct 26, 2023	Automatic Speech Recognition, Data Augmentation	Unverified
Understanding when Dynamics-Invariant Data Augmentations Benefit Model-Free Reinforcement Learning Updates	Oct 26, 2023	Data Augmentation, reinforcement-learning	Code Available
Better integrating vision and semantics for improving few-shot classification	Oct 26, 2023	Data Augmentation, Prompt Engineering	Code Available
Data Augmentation for Emotion Detection in Small Imbalanced Text Data	Oct 25, 2023	Data Augmentation, Emotion Recognition	Code Available
Improving Few-shot Generalization of Safety Classifiers via Data Augmented Parameter-Efficient Fine-Tuning	Oct 25, 2023	Data Augmentation, Few-Shot Learning	Unverified
UAV-Sim: NeRF-based Synthetic Data Generation for UAV-based Perception	Oct 25, 2023	Data Augmentation, Image Generation	Unverified
Transferring a molecular foundation model for polymer property predictions	Oct 25, 2023	Data Augmentation, Transfer Learning	Unverified
Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation	Oct 25, 2023	Conversational Recommendation, Data Augmentation	Code Available
Early Detection of Tuberculosis with Machine Learning Cough Audio Analysis: Towards More Accessible Global Triaging Usage	Oct 25, 2023	Data Augmentation	Unverified
DualMatch: Robust Semi-Supervised Learning with Dual-Level Interaction	Oct 25, 2023	Data Augmentation	Code Available
An Explainable Deep Learning-Based Method For Schizophrenia Diagnosis Using Generative Data-Augmentation	Oct 25, 2023	Data Augmentation, EEG	Unverified
Using GPT-4 to Augment Unbalanced Data for Automatic Scoring	Oct 25, 2023	Data Augmentation, Language Modelling	Unverified
Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary	Oct 24, 2023	Data Augmentation	Unverified
Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles	Oct 24, 2023	Data Augmentation, image-classification	Unverified
Towards contrast-agnostic soft segmentation of the spinal cord	Oct 23, 2023	Data Augmentation, Domain Generalization	Code Available
Statistical Depth for Ranking and Characterizing Transformer-Based Text Embeddings	Oct 23, 2023	Data Augmentation, In-Context Learning	Code Available
S3Aug: Segmentation, Sampling, and Shift for Action Recognition	Oct 23, 2023	Action Recognition, Data Augmentation	Unverified

Benchmark Results

Datasets: ImageNet, CIFAR-10, GA1457

ImageNet

#	Model	Metric	Claimed	Verified	Status
1	DeiT-B (+MixPro)	Accuracy (%)	82.9	—	Unverified
2	ResNet-200 (DeepAA)	Accuracy (%)	81.32	—	Unverified
3	DeiT-S (+MixPro)	Accuracy (%)	81.3	—	Unverified
4	ResNet-200 (Fast AA)	Accuracy (%)	80.6	—	Unverified
5	ResNet-200 (UA)	Accuracy (%)	80.4	—	Unverified
6	ResNet-200 (AA)	Accuracy (%)	80	—	Unverified
7	ResNet-50 (DeepAA)	Accuracy (%)	78.3	—	Unverified
8	ResNet-50 (TA wide)	Accuracy (%)	78.07	—	Unverified
9	ResNet-50 (LoRot-E)	Accuracy (%)	77.72	—	Unverified
10	ResNet-50 (LoRot-I)	Accuracy (%)	77.71	—	Unverified

CIFAR-10

#	Model	Metric	Claimed	Verified	Status
1	WideResNet-40-2 (Faster AA)	Percentage error	3.7	—	Unverified
2	Shake-Shake (26 2×32d) (Faster AA)	Percentage error	2.7	—	Unverified
3	WideResNet-28-10 (Faster AA)	Percentage error	2.6	—	Unverified
4	Shake-Shake (26 2×112d) (Faster AA)	Percentage error	2	—	Unverified
5	Shake-Shake (26 2×96d) (Faster AA)	Percentage error	2	—	Unverified

GA1457

#	Model	Metric	Claimed	Verified	Status
1	DiffAug	Classification Accuracy	92.7	—	Unverified
2	PaCMAP	Classification Accuracy	85.3	—	Unverified
3	hNNE	Classification Accuracy	77.4	—	Unverified
4	TopoAE	Classification Accuracy	74.6	—	Unverified