Data Augmentation

Data augmentation refers to techniques that increase the amount of training data by applying modifications to examples in the original dataset. Besides enlarging the dataset, it also increases its diversity. When training machine learning models, data augmentation acts as a regularizer and helps reduce overfitting.

Data augmentation techniques have proven useful in domains such as NLP and computer vision. In computer vision, common transformations include cropping, flipping, and rotation. In NLP, techniques include swapping, deletion, and random insertion, among others.
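The transformations named above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes an image is a plain 2D list of pixel values and a sentence is a list of tokens, and all function names (`hflip`, `rotate90`, `crop`, `random_swap`, `random_deletion`, `random_insertion`) are hypothetical helpers introduced here. Real pipelines would typically rely on a dedicated library instead.

```python
import random

# --- Image-style transforms on a 2D grid (a list of rows) ---

def hflip(img):
    """Flip horizontally by reversing every row."""
    return [row[::-1] for row in img]

def rotate90(img):
    """Rotate 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def crop(img, top, left, height, width):
    """Keep a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

# --- Token-level text transforms ---

def random_swap(tokens, rng):
    """Swap two randomly chosen positions (no-op for fewer than two tokens)."""
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p, rng):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept or [rng.choice(tokens)]

def random_insertion(tokens, vocabulary, rng):
    """Insert one word drawn from `vocabulary` at a random position."""
    tokens = list(tokens)
    tokens.insert(rng.randint(0, len(tokens)), rng.choice(vocabulary))
    return tokens

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    sentence = "data augmentation helps reduce overfitting".split()
    print(hflip(image))
    print(rotate90(image))
    print(crop(image, 1, 1, 2, 2))
    print(random_swap(sentence, rng))
    print(random_deletion(sentence, 0.2, rng))
    print(random_insertion(sentence, ["often", "usually"], rng))
```

Each function returns a new object and leaves its input untouched, so several augmented variants can be generated from the same original example. Passing an explicit `random.Random` instance keeps the randomness reproducible.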

Papers


Showing 3926–3950 of 8378 papers

Title	Date	Tasks	Status	Hype
Deep-OCTA: Ensemble Deep Learning Approaches for Diabetic Retinopathy Analysis on OCTA Images	Oct 2, 2022	Data Augmentation, Image Quality Assessment	Code Available	1
Pseudo-Label Generation and Various Data Augmentation for Semi-Supervised Hyperspectral Object Detection	Oct 1, 2022	Data Augmentation, object-detection	Code Available	0
Probing the Robustness of Pre-trained Language Models for Entity Matching	Oct 1, 2022	Data Augmentation, Domain Generalization	Code Available	0
Augmented Bio-SBERT: Improving Performance for Pairwise Sentence Tasks in Bio-medical Domain	Oct 1, 2022	Data Augmentation, Sentence	Unverified	0
KUL@SMM4H’22: Template Augmented Adaptive Pre-training for Tweet Classification	Oct 1, 2022	Data Augmentation, Language Modeling	Unverified	0
Low-Resource Neural Machine Translation: A Case Study of Cantonese	Oct 1, 2022	Data Augmentation, Low Resource Neural Machine Translation	Code Available	1
The Only Chance to Understand: Machine Translation of the Severely Endangered Low-resource Languages of Eurasia	Oct 1, 2022	Data Augmentation, Language Modeling	Unverified	0
CAISA@SMM4H’22: Robust Cross-Lingual Detection of Disease Mentions on Social Media with Adversarial Methods	Oct 1, 2022	Data Augmentation	Unverified	0
Lightweight Contextual Logical Structure Recovery	Oct 1, 2022	Articles, Data Augmentation	Unverified	0
BioInfo@UAVR@SMM4H’22: Classification and Extraction of Adverse Event mentions in Tweets using Transformer Models	Oct 1, 2022	Data Augmentation	Unverified	0
Data Augmentation for Improving the Prediction of Validity and Novelty of Argumentative Conclusions	Oct 1, 2022	Data Augmentation	Unverified	0
Data Augmentation for Few-Shot Knowledge Graph Completion from Hierarchical Perspective	Oct 1, 2022	Data Augmentation, Knowledge Graph Completion	Unverified	0
Coordination Generation via Synchronized Text-Infilling	Oct 1, 2022	Data Augmentation, Sentence	Unverified	0
ParaZh-22M: A Large-Scale Chinese Parabank via Machine Translation	Oct 1, 2022	Data Augmentation, Machine Translation	Unverified	0
Rethinking Data Augmentation in Text-to-text Paradigm	Oct 1, 2022	Data Augmentation	Unverified	0
Unsupervised Data Augmentation for Aspect Based Sentiment Analysis	Oct 1, 2022	Aspect-Based Sentiment Analysis, Aspect-Based Sentiment Analysis (ABSA)	Unverified	0
Evaluating and Mitigating Inherent Linguistic Bias of African American English through Inference	Oct 1, 2022	Data Augmentation, Diversity	Unverified	0
Enhancing Task-Specific Distillation in Small Data Regimes through Language Generation	Oct 1, 2022	Data Augmentation, MRPC	Unverified	0
Table-based Fact Verification with Self-labeled Keypoint Alignment	Oct 1, 2022	Attribute, Contrastive Learning	Unverified	0
Towards Summarizing Healthcare Questions in Low-Resource Setting	Oct 1, 2022	Data Augmentation, Diversity	Unverified	0
Effective Data Augmentation for Sentence Classification Using One VAE per Class	Oct 1, 2022	Binary Classification, Data Augmentation	Unverified	0
Dynamic Nonlinear Mixup with Distance-based Sample Selection	Oct 1, 2022	Data Augmentation	Unverified	0
Document-level Event Factuality Identification via Machine Reading Comprehension Frameworks with Transfer Learning	Oct 1, 2022	Data Augmentation, Machine Reading Comprehension	Unverified	0
BRCC and SentiBahasaRojak: The First Bahasa Rojak Corpus for Pretraining and Sentiment Analysis Dataset	Oct 1, 2022	Data Augmentation, Sentiment Analysis	Unverified	0
Summarizing Patients’ Problems from Hospital Progress Notes Using Pre-trained Sequence-to-Sequence Models	Oct 1, 2022	Data Augmentation, Diagnostic	Unverified	0


Benchmark Results

ImageNet

#	Model	Metric	Claimed	Verified	Status
1	DeiT-B (+MixPro)	Accuracy (%)	82.9	—	Unverified
2	ResNet-200 (DeepAA)	Accuracy (%)	81.32	—	Unverified
3	DeiT-S (+MixPro)	Accuracy (%)	81.3	—	Unverified
4	ResNet-200 (Fast AA)	Accuracy (%)	80.6	—	Unverified
5	ResNet-200 (UA)	Accuracy (%)	80.4	—	Unverified
6	ResNet-200 (AA)	Accuracy (%)	80	—	Unverified
7	ResNet-50 (DeepAA)	Accuracy (%)	78.3	—	Unverified
8	ResNet-50 (TA wide)	Accuracy (%)	78.07	—	Unverified
9	ResNet-50 (LoRot-E)	Accuracy (%)	77.72	—	Unverified
10	ResNet-50 (LoRot-I)	Accuracy (%)	77.71	—	Unverified

CIFAR-10

#	Model	Metric	Claimed	Verified	Status
1	WideResNet-40-2 (Faster AA)	Percentage error	3.7	—	Unverified
2	Shake-Shake (26 2×32d) (Faster AA)	Percentage error	2.7	—	Unverified
3	WideResNet-28-10 (Faster AA)	Percentage error	2.6	—	Unverified
4	Shake-Shake (26 2×96d) (Faster AA)	Percentage error	2	—	Unverified
5	Shake-Shake (26 2×112d) (Faster AA)	Percentage error	2	—	Unverified

GA1457

#	Model	Metric	Claimed	Verified	Status
1	DiffAug	Classification Accuracy	92.7	—	Unverified
2	PaCMAP	Classification Accuracy	85.3	—	Unverified
3	hNNE	Classification Accuracy	77.4	—	Unverified
4	TopoAE	Classification Accuracy	74.6	—	Unverified