SOTAVerified

Data Augmentation

Data augmentation refers to techniques that enlarge a dataset by applying modifications to existing examples, producing additional training examples from the original data. Besides growing the dataset, augmentation also increases its diversity. When training machine learning models, it acts as a regularizer and helps reduce overfitting.

Data augmentation techniques have proven useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deletion, and random insertion, among others.
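The transformations named above can be illustrated with a minimal, standard-library-only Python sketch. It is not taken from any particular library: the function names, the toy 3x3 "image" (a list of pixel rows), and the token-level treatment of text are all assumptions chosen for illustration. Real pipelines typically rely on dedicated libraries and operate on arrays or tensors.

```python
import random

# --- Vision-style augmentations on a tiny "image" (a list of pixel rows) ---

def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(col) for col in zip(*image[::-1])]

def random_crop(image, height, width, rng):
    """Cut a height x width window at a random position."""
    top = rng.randint(0, len(image) - height)
    left = rng.randint(0, len(image[0]) - width)
    return [row[left:left + width] for row in image[top:top + height]]

# --- Text-style augmentations on a list of tokens ---

def random_swap(tokens, rng):
    """Swap the tokens at two randomly chosen positions."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept or [rng.choice(tokens)]

def random_insertion(tokens, vocabulary, rng):
    """Insert one word drawn from a (hypothetical) vocabulary at a random position."""
    tokens = list(tokens)
    tokens.insert(rng.randint(0, len(tokens)), rng.choice(vocabulary))
    return tokens

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(horizontal_flip(image))      # [[3, 2, 1], [6, 5, 4], [9, 8, 7]]
    print(rotate_90_clockwise(image))  # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
    print(random_crop(image, 2, 2, rng))

    tokens = "data augmentation helps models generalize".split()
    print(random_swap(tokens, rng))
    print(random_deletion(tokens, 0.2, rng))
    print(random_insertion(tokens, ["often", "usually"], rng))
```

Each function returns a new object and leaves its input untouched, so several augmented variants can be generated from the same original example. Passing the random generator explicitly is a design choice that keeps the augmentations reproducible.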



Papers

Showing 2401–2450 of 8378 papers

| Title | Status | Hype |
|---|---|---|
| Dialect Adaptation and Data Augmentation for Low-Resource ASR: TalTech Systems for the MADASR 2023 Challenge |  | 0 |
| PAC-tuning: Fine-tuning Pretrained Language Models with PAC-driven Perturbed Gradient Descent |  | 0 |
| Using GPT-4 to Augment Unbalanced Data for Automatic Scoring |  | 0 |
| Early Detection of Tuberculosis with Machine Learning Cough Audio Analysis: Towards More Accessible Global Triaging Usage |  | 0 |
| Data Augmentation for Emotion Detection in Small Imbalanced Text Data | Code | 0 |
| Transferring a molecular foundation model for polymer property predictions |  | 0 |
| Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation | Code | 0 |
| Improving Few-shot Generalization of Safety Classifiers via Data Augmented Parameter-Efficient Fine-Tuning |  | 0 |
| An Explainable Deep Learning-Based Method For Schizophrenia Diagnosis Using Generative Data-Augmentation |  | 0 |
| UAV-Sim: NeRF-based Synthetic Data Generation for UAV-based Perception |  | 0 |
| Data Optimization in Deep Learning: A Survey | Code | 1 |
| DualMatch: Robust Semi-Supervised Learning with Dual-Level Interaction | Code | 0 |
| DALE: Generative Data Augmentation for Low-Resource Legal NLP | Code | 1 |
| Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles |  | 0 |
| Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary |  | 0 |
| G2-MonoDepth: A General Framework of Generalized Depth Inference from Monocular RGB+X Data | Code | 1 |
| Towards contrast-agnostic soft segmentation of the spinal cord | Code | 0 |
| Vicinal Feature Statistics Augmentation for Federated 3D Medical Volume Segmentation |  | 0 |
| Data Augmentation Techniques for Machine Translation of Code-Switched Texts: A Comparative Study |  | 0 |
| Statistical Depth for Ranking and Characterizing Transformer-Based Text Embeddings | Code | 0 |
| CalibrationPhys: Self-supervised Video-based Heart and Respiratory Rate Measurements by Calibrating Between Multiple Cameras |  | 0 |
| GRLib: An Open-Source Hand Gesture Detection and Recognition Python Library | Code | 1 |
| S3Aug: Segmentation, Sampling, and Shift for Action Recognition |  | 0 |
| EXPLAIN, EDIT, GENERATE: Rationale-Sensitive Counterfactual Data Augmentation for Multi-hop Fact Verification | Code | 0 |
| Data Augmentation: a Combined Inductive-Deductive Approach featuring Answer Set Programming |  | 0 |
| Diffusion-based Data Augmentation for Nuclei Image Segmentation | Code | 1 |
| Intent Contrastive Learning with Cross Subsequences for Sequential Recommendation | Code | 1 |
| Text generation for dataset augmentation in security classification tasks | Code | 1 |
| PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation | Code | 1 |
| Toward Generative Data Augmentation for Traffic Classification |  | 0 |
| Filling the Missing: Exploring Generative AI for Enhanced Federated Learning over Heterogeneous Mobile Edge Devices |  | 0 |
| Data-Free Knowledge Distillation Using Adversarially Perturbed OpenGL Shader Images |  | 0 |
| DIG-MILP: a Deep Instance Generator for Mixed-Integer Linear Programming with Feasibility Guarantee | Code | 1 |
| A Quality-based Syntactic Template Retriever for Syntactically-controlled Paraphrase Generation | Code | 0 |
| GraphGPT: Graph Instruction Tuning for Large Language Models | Code | 2 |
| A Car Model Identification System for Streamlining the Automobile Sales Process |  | 0 |
| A Distributed Approach to Meteorological Predictions: Addressing Data Imbalance in Precipitation Prediction Models through Federated Learning and GANs |  | 0 |
| Unsupervised Candidate Answer Extraction through Differentiable Masker-Reconstructor Model |  | 0 |
| Data Augmentations for Improved (Large) Language Model Generalization |  | 0 |
| OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift | Code | 1 |
| EmoDiarize: Speaker Diarization and Emotion Identification from Speech Signals using Convolutional Neural Networks |  | 0 |
| DiffBoost: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model | Code | 1 |
| AUC-mixup: Deep AUC Maximization with Mixup |  | 0 |
| MixEdit: Revisiting Data Augmentation and Beyond for Grammatical Error Correction | Code | 1 |
| CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition | Code | 1 |
| Enhancing Spoofing Speech Detection Using Rhythm Information |  | 0 |
| DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification |  | 0 |
| Panoptic Out-of-Distribution Segmentation |  | 0 |
| ChapGTP, ILLC's Attempt at Raising a BabyLM: Improving Data Efficiency by Automatic Task Formation |  | 0 |
| Self-supervision meets kernel graph neural models: From architecture to augmentations |  | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DeiT-B (+MixPro) | Accuracy (%) | 82.9 |  | Unverified |
| 2 | ResNet-200 (DeepAA) | Accuracy (%) | 81.32 |  | Unverified |
| 3 | DeiT-S (+MixPro) | Accuracy (%) | 81.3 |  | Unverified |
| 4 | ResNet-200 (Fast AA) | Accuracy (%) | 80.6 |  | Unverified |
| 5 | ResNet-200 (UA) | Accuracy (%) | 80.4 |  | Unverified |
| 6 | ResNet-200 (AA) | Accuracy (%) | 80 |  | Unverified |
| 7 | ResNet-50 (DeepAA) | Accuracy (%) | 78.3 |  | Unverified |
| 8 | ResNet-50 (TA wide) | Accuracy (%) | 78.07 |  | Unverified |
| 9 | ResNet-50 (LoRot-E) | Accuracy (%) | 77.72 |  | Unverified |
| 10 | ResNet-50 (LoRot-I) | Accuracy (%) | 77.71 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | WideResNet-40-2 (Faster AA) | Percentage error | 3.7 |  | Unverified |
| 2 | Shake-Shake (26 2×32d) (Faster AA) | Percentage error | 2.7 |  | Unverified |
| 3 | WideResNet-28-10 (Faster AA) | Percentage error | 2.6 |  | Unverified |
| 4 | Shake-Shake (26 2×112d) (Faster AA) | Percentage error | 2 |  | Unverified |
| 5 | Shake-Shake (26 2×96d) (Faster AA) | Percentage error | 2 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DiffAug | Classification Accuracy | 92.7 |  | Unverified |
| 2 | PaCMAP | Classification Accuracy | 85.3 |  | Unverified |
| 3 | hNNE | Classification Accuracy | 77.4 |  | Unverified |
| 4 | TopoAE | Classification Accuracy | 74.6 |  | Unverified |