SOTAVerified

Data Augmentation

Data augmentation covers techniques that increase the number of examples in a dataset by applying modifications to the existing ones. It not only grows the dataset but also increases its diversity. When training machine learning models, data augmentation acts as a regularizer and helps to avoid overfitting.

Data augmentation techniques have been found useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deletion, and random insertion, among others.
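The transformations named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the implementation of any particular library: images are assumed to be nested lists of pixel rows, text is assumed to be a list of tokens, and every function name and parameter here (random_crop, horizontal_flip, rotate90_cw, random_swap, random_deletion, random_insertion, p, vocab) is hypothetical.

```python
import random

# --- Computer vision: an "image" is assumed to be a list of pixel rows ---

def random_crop(img, out_h, out_w, rng):
    """Cut a random out_h x out_w window out of the image."""
    top = rng.randrange(len(img) - out_h + 1)
    left = rng.randrange(len(img[0]) - out_w + 1)
    return [row[left:left + out_w] for row in img[top:top + out_h]]

def horizontal_flip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def rotate90_cw(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

# --- NLP: a "sentence" is assumed to be a list of tokens ---

def random_swap(tokens, rng):
    """Swap two randomly chosen positions."""
    out = list(tokens)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept or [rng.choice(tokens)]

def random_insertion(tokens, vocab, rng):
    """Insert one token drawn from vocab (an assumed word list) at a random position."""
    out = list(tokens)
    out.insert(rng.randrange(len(out) + 1), rng.choice(vocab))
    return out

if __name__ == "__main__":
    rng = random.Random(0)  # seeded so runs are repeatable
    image = [[1, 2, 3], [4, 5, 6]]
    sentence = "data augmentation helps to avoid overfitting".split()
    print(horizontal_flip(image))
    print(rotate90_cw(image))
    print(random_crop(image, 2, 2, rng))
    print(random_swap(sentence, rng))
    print(random_deletion(sentence, 0.2, rng))
    print(random_insertion(sentence, ["really", "often"], rng))
```

In a training loop, such functions would typically be applied on the fly with fresh randomness each epoch, so the model rarely sees exactly the same example twice.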

(Image credit: Albumentations)

Papers

Showing 1001–1050 of 8378 papers

| Title | Status | Hype |
|---|---|---|
| Correlation-Aware Select and Merge Attention for Efficient Fine-Tuning and Context Length Extension | | 0 |
| RFBoost: Understanding and Boosting Deep WiFi Sensing via Physical Data Augmentation | Code | 1 |
| CUDLE: Learning Under Label Scarcity to Detect Cannabis Use in Uncontrolled Environments | | 0 |
| Investigating and Mitigating Object Hallucinations in Pretrained Vision-Language (CLIP) Models | Code | 0 |
| Comparative Analysis and Ensemble Enhancement of Leading CNN Architectures for Breast Cancer Classification | | 0 |
| Cognitive Biases in Large Language Models for News Recommendation | | 0 |
| AlzhiNet: Traversing from 2DCNN to 3DCNN, Towards Early Detection and Diagnosis of Alzheimer's Disease | | 0 |
| Can Language Models Take A Hint? Prompting for Controllable Contextualized Commonsense Inference | | 0 |
| QDGset: A Large Scale Grasping Dataset Generated with Quality-Diversity | | 0 |
| A Novel Method for Accurate & Real-time Food Classification: The Synergistic Integration of EfficientNetB7, CBAM, Transfer Learning, and Data Augmentation | | 0 |
| Capturing complex hand movements and object interactions using machine learning-powered stretchable smart textile gloves | Code | 1 |
| SAFLEX: Self-Adaptive Augmentation via Feature Label Extrapolation | | 0 |
| TAEGAN: Generating Synthetic Tabular Data For Data Augmentation | | 0 |
| Generate then Refine: Data Augmentation for Zero-shot Intent Detection | Code | 0 |
| Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data | Code | 1 |
| PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation | | 0 |
| Intent Detection in the Age of LLMs | | 0 |
| HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models | Code | 1 |
| Formula-Driven Data Augmentation and Partial Retinal Layer Copying for Retinal Layer Segmentation | | 0 |
| Ensembles provably learn equivariance through data augmentation | Code | 0 |
| Equivariant score-based generative models provably learn distributions with symmetries efficiently | | 0 |
| Data Extrapolation for Text-to-image Generation on Small Datasets | Code | 1 |
| ProxiMix: Enhancing Fairness with Proximity Samples in Subgroups | | 0 |
| From Natural Language to SQL: Review of LLM-based Text-to-SQL Systems | | 0 |
| Pseudo-Non-Linear Data Augmentation via Energy Minimization | | 0 |
| Augmentation through Laundering Attacks for Audio Spoof Detection | | 0 |
| Exploring Empty Spaces: Human-in-the-Loop Data Augmentation | Code | 1 |
| Targeted synthetic data generation for tabular data via hardness characterization | Code | 0 |
| SyntheOcc: Synthesize Geometric-Controlled Street View Images through 3D Semantic MPIs | | 0 |
| Cross-lingual Back-Parsing: Utterance Synthesis from Meaning Representation for Zero-Resource Semantic Parsing | Code | 0 |
| RAD: A Dataset and Benchmark for Real-Life Anomaly Detection with Robotic Observations | Code | 1 |
| Accent conversion using discrete units with parallel data synthesized from controllable accented TTS | | 0 |
| Enhancing Romanian Offensive Language Detection through Knowledge Distillation, Multi-Task Learning, and Data Augmentation | | 0 |
| Depression detection in social media posts using transformer-based models and auxiliary features | | 0 |
| LLMEmb: Large Language Model Can Be a Good Embedding Generator for Sequential Recommendation | Code | 2 |
| Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model | | 0 |
| SWIM: Short-Window CNN Integrated with Mamba for EEG-Based Auditory Spatial Attention Decoding | Code | 1 |
| FlexSBDD: Structure-Based Drug Design with Flexible Protein Modeling | | 0 |
| DropEdge not Foolproof: Effective Augmentation Method for Signed Graph Neural Networks | | 0 |
| Membership Privacy Evaluation in Deep Spiking Neural Networks | | 0 |
| Introducing SDICE: An Index for Assessing Diversity of Synthetic Medical Datasets | | 0 |
| TwinCL: A Twin Graph Contrastive Learning Model for Collaborative Filtering | Code | 0 |
| Multi-modal Cross-domain Self-supervised Pre-training for fMRI and EEG Fusion | | 0 |
| HardCore Generation: Generating Hard UNSAT Problems for Data Augmentation | | 0 |
| Jump Diffusion-Informed Neural Networks with Transfer Learning for Accurate American Option Pricing under Data Scarcity | | 0 |
| Good Data Is All Imitation Learning Needs | | 0 |
| Enhancing elusive clues in knowledge learning by contrasting attention of language models | Code | 0 |
| Reducing and Exploiting Data Augmentation Noise through Meta Reweighting Contrastive Learning for Text Classification | | 0 |
| Conjugate Bayesian Two-step Change Point Detection for Hawkes Process | Code | 0 |
| Advancing Open-Set Domain Generalization Using Evidential Bi-Level Hardest Domain Scheduler | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DeiT-B (+MixPro) | Accuracy (%) | 82.9 | | Unverified |
| 2 | ResNet-200 (DeepAA) | Accuracy (%) | 81.32 | | Unverified |
| 3 | DeiT-S (+MixPro) | Accuracy (%) | 81.3 | | Unverified |
| 4 | ResNet-200 (Fast AA) | Accuracy (%) | 80.6 | | Unverified |
| 5 | ResNet-200 (UA) | Accuracy (%) | 80.4 | | Unverified |
| 6 | ResNet-200 (AA) | Accuracy (%) | 80 | | Unverified |
| 7 | ResNet-50 (DeepAA) | Accuracy (%) | 78.3 | | Unverified |
| 8 | ResNet-50 (TA wide) | Accuracy (%) | 78.07 | | Unverified |
| 9 | ResNet-50 (LoRot-E) | Accuracy (%) | 77.72 | | Unverified |
| 10 | ResNet-50 (LoRot-I) | Accuracy (%) | 77.71 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | WideResNet-40-2 (Faster AA) | Percentage error | 3.7 | | Unverified |
| 2 | Shake-Shake (26 2×32d) (Faster AA) | Percentage error | 2.7 | | Unverified |
| 3 | WideResNet-28-10 (Faster AA) | Percentage error | 2.6 | | Unverified |
| 4 | Shake-Shake (26 2×112d) (Faster AA) | Percentage error | 2 | | Unverified |
| 5 | Shake-Shake (26 2×96d) (Faster AA) | Percentage error | 2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DiffAug | Classification Accuracy | 92.7 | | Unverified |
| 2 | PaCMAP | Classification Accuracy | 85.3 | | Unverified |
| 3 | hNNE | Classification Accuracy | 77.4 | | Unverified |
| 4 | TopoAE | Classification Accuracy | 74.6 | | Unverified |