SOTAVerified

Synthetic Data Generation

The generation of tabular data by any means possible.

Papers

Showing 51-100 of 822 papers

Title | Status | Hype
MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures | Code | 1
AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data | Code | 1
CLIPPER: Compression enables long-context synthetic data generation | Code | 1
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails | Code | 1
SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset | Code | 1
XRF V2: A Dataset for Action Summarization with Wi-Fi Signals, and IMUs in Phones, Watches, Earbuds, and Glasses | Code | 1
FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data | Code | 1
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement | Code | 1
Synthetic Data Generation by Supervised Neural Gas Network for Physiological Emotion Recognition Data | Code | 1
Generating Traffic Scenarios via In-Context Learning to Learn Better Motion Planner | Code | 1
Using matrix-product states for time-series machine learning | Code | 1
ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis | Code | 1
SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models | Code | 1
Seed-Free Synthetic Data Generation Framework for Instruction-Tuning LLMs: A Case Study in Thai | Code | 1
BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages | Code | 1
SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification | Code | 1
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis | Code | 1
Voice Disorder Analysis: a Transformer-based Approach | Code | 1
SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation | Code | 1
Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models | Code | 1
EPIC: Effective Prompting for Imbalanced-Class Data Synthesis in Tabular Data Classification via Large Language Models | Code | 1
An evaluation framework for synthetic data generation models | Code | 1
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs | Code | 1
Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes | Code | 1
Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction | Code | 1
RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation | Code | 1
View-Dependent Octree-based Mesh Extraction in Unbounded Scenes for Procedural Synthetic Data | Code | 1
D3A-TS: Denoising-Driven Data Augmentation in Time Series | Code | 1
AutoDiff: combining Auto-encoder and Diffusion model for tabular data synthesizing | Code | 1
GECTurk: Grammatical Error Correction and Detection Dataset for Turkish | Code | 1
LEyes: A Lightweight Framework for Deep Learning-Based Eye Tracking using Synthetic Eye Images | Code | 1
AnthroNet: Conditional Generation of Humans via Anthropometrics | Code | 1
FinDiff: Diffusion Models for Financial Tabular Data Generation | Code | 1
Generating tabular datasets under differential privacy | Code | 1
POV-Surgery: A Dataset for Egocentric Hand and Tool Pose Estimation During Surgical Activities | Code | 1
SynTable: A Synthetic Data Generation Pipeline for Unseen Object Amodal Instance Segmentation of Cluttered Tabletop Scenes | Code | 1
PyTrial: Machine Learning Software and Benchmark for Clinical Trial Applications | Code | 1
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes | Code | 1
Leveraging Generative AI Models for Synthetic Data Generation in Healthcare: Balancing Research and Privacy | Code | 1
Learning from synthetic data generated with GRADE | Code | 1
Synthetic Data-based Detection of Zebras in Drone Imagery | Code | 1
SocialDial: A Benchmark for Socially-Aware Dialogue Systems | Code | 1
Natural Language-Based Synthetic Data Generation for Cluster Analysis | Code | 1
Diffusion-HPC: Synthetic Data Generation for Human Mesh Recovery in Challenging Domains | Code | 1
Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction | Code | 1
EEG Synthetic Data Generation Using Probabilistic Diffusion Models | Code | 1
Generating Multidimensional Clusters With Support Lines | Code | 1
Diffusion-based Conditional ECG Generation with Structured State Space Models | Code | 1
Boosting Synthetic Data Generation with Effective Nonlinear Causal Discovery | Code | 1
Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | corGAN | AUROC | 0.92 | - | Unverified
2 | GAN | AUROC | 0.87 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | kiNETGAN | EMD | 0.07 | - | Unverified
2 | CTGAN | EMD | 0.07 | - | Unverified