Data Augmentation

Data augmentation refers to techniques that increase the amount of training data by applying modifications to existing examples, expanding the number of examples in the original dataset. Besides growing the dataset, it also increases the dataset's diversity. When training machine learning models, data augmentation acts as a regularizer and helps reduce overfitting.
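
As a rough illustration of how modifications expand a dataset, the sketch below adds a mirrored and a rotated copy of every example, so the dataset triples in size. It is a minimal example, not a reference implementation: images are assumed to be plain nested lists of pixel values, the helper names (`hflip`, `rotate90`, `augment_dataset`) are hypothetical, and it assumes the transformations do not change the label, which does not hold for every task.

```python
def hflip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment_dataset(images, labels):
    """Return the original examples plus a flipped and a rotated copy of each.

    Assumption: both transformations are label-preserving for the task.
    """
    out_images, out_labels = [], []
    for image, label in zip(images, labels):
        for variant in (image, hflip(image), rotate90(image)):
            out_images.append(variant)
            out_labels.append(label)
    return out_images, out_labels


# Toy dataset: two 2x2 "images" with hypothetical class labels.
images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
labels = ["cat", "dog"]
aug_images, aug_labels = augment_dataset(images, labels)
print(len(images), "->", len(aug_images), "examples")
```

In practice the variants are usually sampled at random during training rather than stored, but the effect on the model is the same: it sees more, and more varied, examples per label.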

Data augmentation techniques have proved useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deletion, and random insertion, among others.
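
The NLP operations named above can be sketched at the token level as follows. This is one possible reading, under stated assumptions: sentences are whitespace-tokenized lists of words, the function names are hypothetical, and random insertion draws the new word from a caller-supplied `vocabulary` list (published methods often insert a synonym of an existing word instead).

```python
import random


def random_swap(tokens, rng):
    """Swap the tokens at two randomly chosen positions."""
    out = list(tokens)
    if len(out) < 2:
        return out
    i, j = rng.sample(range(len(out)), 2)
    out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def random_insertion(tokens, vocabulary, rng):
    """Insert one word from `vocabulary` (an assumed word list) at a random position."""
    out = list(tokens)
    out.insert(rng.randint(0, len(out)), rng.choice(vocabulary))
    return out


rng = random.Random(0)  # seeded so that runs are repeatable
sentence = "data augmentation increases dataset diversity".split()
print(random_swap(sentence, rng))
print(random_deletion(sentence, 0.2, rng))
print(random_insertion(sentence, ["training", "model"], rng))
```

Each call returns a new list and leaves the input sentence untouched, so several augmented variants can be generated from the same original.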

Papers

Showing 3901–3950 of 8378 papers

Title	Date	Tasks	Status	Hype
In What Ways Are Deep Neural Networks Invariant and How Should We Measure This?	Oct 7, 2022	Data Augmentation, Deep Learning	Unverified	0
Evaluating the Performance of StyleGAN2-ADA on Medical Images	Oct 7, 2022	Computed Tomography (CT), Data Augmentation	Unverified	0
UU-Tax at SemEval-2022 Task 3: Improving the generalizability of language models for taxonomy classification through data augmentation	Oct 7, 2022	Binary Classification, Data Augmentation	Code Available	0
Trustworthiness of Laser-Induced Breakdown Spectroscopy Predictions via Simulation-based Synthetic Data Augmentation and Multitask Learning	Oct 7, 2022	Data Augmentation, Dimensionality Reduction	Unverified	0
Automated segmentation and morphological characterization of placental histology images based on a single labeled image	Oct 7, 2022	Data Augmentation, Diversity	Code Available	1
On the Effectiveness of Hybrid Pooling in Mixup-Based Graph Learning for Language Processing	Oct 6, 2022	Code Classification, Data Augmentation	Code Available	0
A ResNet is All You Need? Modeling A Strong Baseline for Detecting Referable Diabetic Retinopathy in Fundus Images	Oct 6, 2022	All, Data Augmentation	Code Available	0
BootAug: Boosting Text Augmentation via Hybrid Instance Filtering Framework	Oct 6, 2022	Classification, Data Augmentation	Code Available	1
MIXCODE: Enhancing Code Classification by Mixup-Based Data Augmentation	Oct 6, 2022	Classification, Code Classification	Code Available	1
Data-driven Approaches to Surrogate Machine Learning Model Development	Oct 6, 2022	Data Augmentation, Transfer Learning	Unverified	0
Data Augmentation-free Unsupervised Learning for 3D Point Cloud Understanding	Oct 6, 2022	3D Object Classification, Contrastive Learning	Code Available	1
GT-GAN: General Purpose Time Series Synthesis with Generative Adversarial Networks	Oct 5, 2022	Data Augmentation, Generative Adversarial Network	Unverified	0
TC-SKNet with GridMask for Low-complexity Classification of Acoustic scene	Oct 5, 2022	AutoML, Data Augmentation	Unverified	0
The Vendi Score: A Diversity Evaluation Metric for Machine Learning	Oct 5, 2022	Data Augmentation, Diversity	Code Available	1
The Calibration Generalization Gap	Oct 5, 2022	Data Augmentation	Code Available	1
Transformer-based conditional generative adversarial network for multivariate time series generation	Oct 5, 2022	Data Augmentation, Generative Adversarial Network	Code Available	1
Adaptively Weighted Data Augmentation Consistency Regularization for Robust Optimization under Concept Shift	Oct 4, 2022	Data Augmentation, Image Segmentation	Unverified	0
Nuisances via Negativa: Adjusting for Spurious Correlations via Data Augmentation	Oct 4, 2022	Data Augmentation, Natural Language Inference	Unverified	0
Code-Switching without Switching: Language Agnostic End-to-End Speech Translation	Oct 4, 2022	Data Augmentation, speech-recognition	Unverified	0
PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes	Oct 4, 2022	Autonomous Driving, Data Augmentation	Code Available	1
rPPG-Toolbox: Deep Remote PPG Toolbox	Oct 3, 2022	Benchmarking, Data Augmentation	Code Available	2
MultiLoad-GAN: A GAN-Based Synthetic Load Group Generation Method Considering Spatial-Temporal Correlations	Oct 3, 2022	Data Augmentation, Generative Adversarial Network	Unverified	0
Mutual Information Learned Classifiers: an Information-theoretic Viewpoint of Training Deep Learning Classification Systems	Oct 3, 2022	Binary Classification, Data Augmentation	Unverified	0
Random Data Augmentation based Enhancement: A Generalized Enhancement Approach for Medical Datasets	Oct 3, 2022	Data Augmentation, Diagnostic	Code Available	0
Smooth image-to-image translations with latent space interpolations	Oct 3, 2022	Data Augmentation, Inductive Bias	Code Available	0
Deep-OCTA: Ensemble Deep Learning Approaches for Diabetic Retinopathy Analysis on OCTA Images	Oct 2, 2022	Data Augmentation, Image Quality Assessment	Code Available	1
Pseudo-Label Generation and Various Data Augmentation for Semi-Supervised Hyperspectral Object Detection	Oct 1, 2022	Data Augmentation, object-detection	Code Available	0
Probing the Robustness of Pre-trained Language Models for Entity Matching	Oct 1, 2022	Data Augmentation, Domain Generalization	Code Available	0
Augmented Bio-SBERT: Improving Performance for Pairwise Sentence Tasks in Bio-medical Domain	Oct 1, 2022	Data Augmentation, Sentence	Unverified	0
KUL@SMM4H’22: Template Augmented Adaptive Pre-training for Tweet Classification	Oct 1, 2022	Data Augmentation, Language Modeling	Unverified	0
Low-Resource Neural Machine Translation: A Case Study of Cantonese	Oct 1, 2022	Data Augmentation, Low Resource Neural Machine Translation	Code Available	1
The Only Chance to Understand: Machine Translation of the Severely Endangered Low-resource Languages of Eurasia	Oct 1, 2022	Data Augmentation, Language Modeling	Unverified	0
CAISA@SMM4H’22: Robust Cross-Lingual Detection of Disease Mentions on Social Media with Adversarial Methods	Oct 1, 2022	Data Augmentation	Unverified	0
Lightweight Contextual Logical Structure Recovery	Oct 1, 2022	Articles, Data Augmentation	Unverified	0
BioInfo@UAVR@SMM4H’22: Classification and Extraction of Adverse Event mentions in Tweets using Transformer Models	Oct 1, 2022	Data Augmentation	Unverified	0
Data Augmentation for Improving the Prediction of Validity and Novelty of Argumentative Conclusions	Oct 1, 2022	Data Augmentation	Unverified	0
Data Augmentation for Few-Shot Knowledge Graph Completion from Hierarchical Perspective	Oct 1, 2022	Data Augmentation, Knowledge Graph Completion	Unverified	0
Coordination Generation via Synchronized Text-Infilling	Oct 1, 2022	Data Augmentation, Sentence	Unverified	0
ParaZh-22M: A Large-Scale Chinese Parabank via Machine Translation	Oct 1, 2022	Data Augmentation, Machine Translation	Unverified	0
Rethinking Data Augmentation in Text-to-text Paradigm	Oct 1, 2022	Data Augmentation	Unverified	0
Unsupervised Data Augmentation for Aspect Based Sentiment Analysis	Oct 1, 2022	Aspect-Based Sentiment Analysis, Aspect-Based Sentiment Analysis (ABSA)	Unverified	0
Evaluating and Mitigating Inherent Linguistic Bias of African American English through Inference	Oct 1, 2022	Data Augmentation, Diversity	Unverified	0
Enhancing Task-Specific Distillation in Small Data Regimes through Language Generation	Oct 1, 2022	Data Augmentation, MRPC	Unverified	0
Table-based Fact Verification with Self-labeled Keypoint Alignment	Oct 1, 2022	Attribute, Contrastive Learning	Unverified	0
Towards Summarizing Healthcare Questions in Low-Resource Setting	Oct 1, 2022	Data Augmentation, Diversity	Unverified	0
Effective Data Augmentation for Sentence Classification Using One VAE per Class	Oct 1, 2022	Binary Classification, Data Augmentation	Unverified	0
Dynamic Nonlinear Mixup with Distance-based Sample Selection	Oct 1, 2022	Data Augmentation	Unverified	0
Document-level Event Factuality Identification via Machine Reading Comprehension Frameworks with Transfer Learning	Oct 1, 2022	Data Augmentation, Machine Reading Comprehension	Unverified	0
BRCC and SentiBahasaRojak: The First Bahasa Rojak Corpus for Pretraining and Sentiment Analysis Dataset	Oct 1, 2022	Data Augmentation, Sentiment Analysis	Unverified	0
Summarizing Patients’ Problems from Hospital Progress Notes Using Pre-trained Sequence-to-Sequence Models	Oct 1, 2022	Data Augmentation, Diagnostic	Unverified	0

Datasets: ImageNet, CIFAR-10, GA1457

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	DeiT-B (+MixPro)	Accuracy (%)	82.9	—	Unverified
2	ResNet-200 (DeepAA)	Accuracy (%)	81.32	—	Unverified
3	DeiT-S (+MixPro)	Accuracy (%)	81.3	—	Unverified
4	ResNet-200 (Fast AA)	Accuracy (%)	80.6	—	Unverified
5	ResNet-200 (UA)	Accuracy (%)	80.4	—	Unverified
6	ResNet-200 (AA)	Accuracy (%)	80	—	Unverified
7	ResNet-50 (DeepAA)	Accuracy (%)	78.3	—	Unverified
8	ResNet-50 (TA wide)	Accuracy (%)	78.07	—	Unverified
9	ResNet-50 (LoRot-E)	Accuracy (%)	77.72	—	Unverified
10	ResNet-50 (LoRot-I)	Accuracy (%)	77.71	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WideResNet-40-2 (Faster AA)	Percentage error	3.7	—	Unverified
2	Shake-Shake (26 2×32d) (Faster AA)	Percentage error	2.7	—	Unverified
3	WideResNet-28-10 (Faster AA)	Percentage error	2.6	—	Unverified
4	Shake-Shake (26 2×112d) (Faster AA)	Percentage error	2	—	Unverified
5	Shake-Shake (26 2×96d) (Faster AA)	Percentage error	2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DiffAug	Classification Accuracy	92.7	—	Unverified
2	PaCMAP	Classification Accuracy	85.3	—	Unverified
3	hNNE	Classification Accuracy	77.4	—	Unverified
4	TopoAE	Classification Accuracy	74.6	—	Unverified