SOTAVerified

Data Augmentation

Data augmentation refers to techniques that enlarge a dataset by applying modifications to its existing examples. Besides growing the dataset, augmentation also increases its diversity. When training machine learning models, data augmentation acts as a regularizer and helps reduce overfitting.

Data augmentation techniques have proven useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deletion, and random insertion, among others.
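
The transformations named above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes NumPy is available, the function names `augment_image` and `augment_tokens` and all parameter defaults are hypothetical, and the insertion step simply re-inserts a copy of an existing token (practical NLP pipelines often insert a related word instead, which would need an extra lexical resource).

```python
import random

import numpy as np


def augment_image(img, rng):
    """Randomly flip, rotate, and crop an H x W x C image array."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                      # horizontal flip
    img = np.rot90(img, k=int(rng.integers(0, 4)))  # rotate by 0/90/180/270 degrees
    h, w = img.shape[:2]
    ch, cw = int(h * 0.8), int(w * 0.8)         # crop to 80% of each side (arbitrary choice)
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    return img[top:top + ch, left:left + cw]


def augment_tokens(tokens, rng, p_delete=0.1, n_swaps=1, n_inserts=1):
    """Apply random deletion, swapping, and insertion to a list of tokens."""
    tokens = list(tokens)
    if not tokens:
        return []
    # Random deletion: drop each token with probability p_delete, keep at least one.
    out = [t for t in tokens if rng.random() >= p_delete]
    if not out:
        out = [rng.choice(tokens)]
    # Random swap: exchange two distinct positions.
    for _ in range(n_swaps):
        if len(out) >= 2:
            i, j = rng.sample(range(len(out)), 2)
            out[i], out[j] = out[j], out[i]
    # Random insertion (simplified assumption): copy an existing token to a random position.
    for _ in range(n_inserts):
        out.insert(rng.randrange(len(out) + 1), rng.choice(out))
    return out


if __name__ == "__main__":
    image = np.zeros((32, 32, 3), dtype=np.uint8)
    print(augment_image(image, np.random.default_rng(0)).shape)
    print(augment_tokens("data augmentation acts as a regularizer".split(), random.Random(0)))
```

Each call produces a different variant of the same example, so applying the functions on the fly during training exposes the model to more varied inputs without storing extra data.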

(Image credit: Albumentations)

Papers

Showing 301–350 of 8378 papers

| Title | Status | Hype |
| --- | --- | --- |
| MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting | Code | 1 |
| 3D MRI Synthesis with Slice-Based Latent Diffusion Models: Improving Tumor Segmentation Tasks in Data-Scarce Regimes | Code | 1 |
| TabPFGen -- Tabular Data Generation with TabPFN | Code | 1 |
| Diff-Mosaic: Augmenting Realistic Representations in Infrared Small Target Detection via Diffusion Prior | Code | 1 |
| Diffusion-based Image Generation for In-distribution Data Augmentation in Surface Defect Detection | Code | 1 |
| Causal Action Influence Aware Counterfactual Data Augmentation | Code | 1 |
| GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning | Code | 1 |
| A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | Code | 1 |
| USD: Unsupervised Soft Contrastive Learning for Fault Detection in Multivariate Time Series | Code | 1 |
| Membership Inference on Text-to-Image Diffusion Models via Conditional Likelihood Discrepancy | Code | 1 |
| Boosted Neural Decoders: Achieving Extreme Reliability of LDPC Codes for 6G Networks | Code | 1 |
| Mosaic-IT: Free Compositional Data Augmentation Improves Instruction Tuning | Code | 1 |
| Simultaneous Masking, Not Prompting Optimization: A Paradigm Shift in Fine-tuning LLMs for Simultaneous Translation | Code | 1 |
| Cross-Domain Feature Augmentation for Domain Generalization | Code | 1 |
| ACTION: Augmentation and Computation Toolbox for Brain Network Analysis with Functional MRI | Code | 1 |
| Universal Adversarial Perturbations for Vision-Language Pre-trained Models | Code | 1 |
| THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models | Code | 1 |
| AugmenTory: A Fast and Flexible Polygon Augmentation Library | Code | 1 |
| Provably Unlearnable Data Examples | Code | 1 |
| KID-PPG: Knowledge Informed Deep Learning for Extracting Heart Rate from a Smartwatch | Code | 1 |
| RaffeSDG: Random Frequency Filtering enabled Single-source Domain Generalization for Medical Image Segmentation | Code | 1 |
| AAPL: Adding Attributes to Prompt Learning for Vision-Language Models | Code | 1 |
| MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking | Code | 1 |
| The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data | Code | 1 |
| RoofDiffusion: Constructing Roofs from Severely Corrupted Point Data via Diffusion | Code | 1 |
| An evaluation framework for synthetic data generation models | Code | 1 |
| FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation | Code | 1 |
| AnnoCTR: A Dataset for Detecting and Linking Entities, Tactics, and Techniques in Cyber Threat Reports | Code | 1 |
| ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling | Code | 1 |
| FPL+: Filtered Pseudo Label-based Unsupervised Cross-Modality Adaptation for 3D Medical Image Segmentation | Code | 1 |
| PairAug: What Can Augmented Image-Text Pairs Do for Radiology? | Code | 1 |
| JUICER: Data-Efficient Imitation Learning for Robotic Assembly | Code | 1 |
| LiteNeXt: A Novel Lightweight ConvMixer-based Model with Self-embedding Representation Parallel for Medical Image Segmentation | Code | 1 |
| ContrastCAD: Contrastive Learning-based Representation Learning for Computer-Aided Design Models | Code | 1 |
| Source-Aware Training Enables Knowledge Attribution in Language Models | Code | 1 |
| Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Code | 1 |
| GeNet: A Graph Neural Network-based Anti-noise Task-Oriented Semantic Communication Paradigm | Code | 1 |
| MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation | Code | 1 |
| RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content | Code | 1 |
| TexTile: A Differentiable Metric for Texture Tileability | Code | 1 |
| DreamDA: Generative Data Augmentation with Diffusion Models | Code | 1 |
| Do Generated Data Always Help Contrastive Learning? | Code | 1 |
| SETA: Semantic-Aware Token Augmentation for Domain Generalization | Code | 1 |
| GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning | Code | 1 |
| Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment | Code | 1 |
| Is Contrastive Learning Necessary? A Study of Data Augmentation vs Contrastive Learning in Sequential Recommendation | Code | 1 |
| YOLOv9 for Fracture Detection in Pediatric Wrist Trauma X-ray Images | Code | 1 |
| SF(DA)^2: Source-free Domain Adaptation Through the Lens of Data Augmentation | Code | 1 |
| EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning | Code | 1 |
| EventRPG: Event Data Augmentation with Relevance Propagation Guidance | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DeiT-B (+MixPro) | Accuracy (%) | 82.9 | | Unverified |
| 2 | ResNet-200 (DeepAA) | Accuracy (%) | 81.32 | | Unverified |
| 3 | DeiT-S (+MixPro) | Accuracy (%) | 81.3 | | Unverified |
| 4 | ResNet-200 (Fast AA) | Accuracy (%) | 80.6 | | Unverified |
| 5 | ResNet-200 (UA) | Accuracy (%) | 80.4 | | Unverified |
| 6 | ResNet-200 (AA) | Accuracy (%) | 80 | | Unverified |
| 7 | ResNet-50 (DeepAA) | Accuracy (%) | 78.3 | | Unverified |
| 8 | ResNet-50 (TA wide) | Accuracy (%) | 78.07 | | Unverified |
| 9 | ResNet-50 (LoRot-E) | Accuracy (%) | 77.72 | | Unverified |
| 10 | ResNet-50 (LoRot-I) | Accuracy (%) | 77.71 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | WideResNet-40-2 (Faster AA) | Percentage error | 3.7 | | Unverified |
| 2 | Shake-Shake (26 2×32d) (Faster AA) | Percentage error | 2.7 | | Unverified |
| 3 | WideResNet-28-10 (Faster AA) | Percentage error | 2.6 | | Unverified |
| 4 | Shake-Shake (26 2×112d) (Faster AA) | Percentage error | 2 | | Unverified |
| 5 | Shake-Shake (26 2×96d) (Faster AA) | Percentage error | 2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DiffAug | Classification Accuracy | 92.7 | | Unverified |
| 2 | PaCMAP | Classification Accuracy | 85.3 | | Unverified |
| 3 | hNNE | Classification Accuracy | 77.4 | | Unverified |
| 4 | TopoAE | Classification Accuracy | 74.6 | | Unverified |