SOTAVerified

Synthetic Data Generation

The generation of tabular data by any means possible.

Papers

Showing 101-150 of 822 papers

Title | Status | Hype
Improved Training of Wasserstein GANs | Code | 1
Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models | Code | 1
BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages | Code | 1
AnthroNet: Conditional Generation of Humans via Anthropometrics | Code | 1
MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures | Code | 1
Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction | Code | 1
FinDiff: Diffusion Models for Financial Tabular Data Generation | Code | 1
Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint | Code | 1
FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data | Code | 1
SYNC: A Copula based Framework for Generating Synthetic Data from Aggregated Sources | Code | 1
GECTurk: Grammatical Error Correction and Detection Dataset for Turkish | Code | 1
Synthetic Data-based Detection of Zebras in Drone Imagery | Code | 1
Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction | Code | 1
Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction | Code | 1
DeepNAG: Deep Non-Adversarial Gesture Generation | Code | 1
Tabular Transformers for Modeling Multivariate Time Series | Code | 1
RAGSynth: Synthetic Data for Robust and Faithful RAG Component Optimization | Code | 1
Towards Ball Spin and Trajectory Analysis in Table Tennis Broadcast Videos via Physically Grounded Synthetic-to-Real Transfer | Code | 1
EPIC: Effective Prompting for Imbalanced-Class Data Synthesis in Tabular Data Classification via Large Language Models | Code | 1
AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data | Code | 1
Generalizing electrocardiogram delineation -- Training convolutional neural networks with synthetic data augmentation | Code | 1
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails | Code | 1
Diffusion-based Conditional ECG Generation with Structured State Space Models | Code | 1
EC-GAN: Low-Sample Classification using Semi-Supervised Algorithms and GANs | Code | 1
UnrealROX+: An Improved Tool for Acquiring Synthetic Data from Virtual 3D Environments | Code | 1
Using matrix-product states for time-series machine learning | Code | 1
Characterization and Greedy Learning of Gaussian Structural Causal Models under Unknown Interventions | Code | 1
DFNet: Enhance Absolute Pose Regression with Direct Feature Matching | Code | 1
DP-MERF: Differentially Private Mean Embeddings with Random Features for Practical Privacy-Preserving Data Generation | Code | 1
Controllable 3D Generative Adversarial Face Model via Disentangling Shape and Appearance | Code | 1
A Comprehensive Survey of Synthetic Tabular Data Generation | Code | 1
Diffusion-HPC: Synthetic Data Generation for Human Mesh Recovery in Challenging Domains | Code | 1
dpart: Differentially Private Autoregressive Tabular, a General Framework for Synthetic Data Generation | Code | 1
dpmm: Differentially Private Marginal Models, a Library for Synthetic Tabular Data Generation | Code | 1
CorGAN: Correlation-Capturing Convolutional Generative Adversarial Networks for Generating Synthetic Healthcare Records | Code | 1
ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval | Code | 1
CLIPPER: Compression enables long-context synthetic data generation | Code | 1
Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes | Code | 1
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement | Code | 1
Exploring Transformer Text Generation for Medical Dataset Augmentation | Code | 1
Differentially Private Synthetic Medical Data Generation using Convolutional GANs | Code | 1
EEG Synthetic Data Generation Using Probabilistic Diffusion Models | Code | 1
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes | Code | 1
Copula-based synthetic data augmentation for machine-learning emulators | Code | 1
BLEUBERI: BLEU is a surprisingly effective reward for instruction following | Code | 1
Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs | Code | 1
MEDIBENG WHISPER TINY: A FINE-TUNED CODE-SWITCHED BENGALI-ENGLISH TRANSLATOR FOR CLINICAL APPLICATIONS | Code | 1
GeoPointGAN: Synthetic Spatial Data with Local Label Differential Privacy | Code | 1
Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics | Code | 1
Will we run out of data? Limits of LLM scaling based on human-generated data | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | corGAN | AUROC | 0.92 | | Unverified
2 | GAN | AUROC | 0.87 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | kiNETGAN | EMD | 0.07 | | Unverified
2 | CTGAN | EMD | 0.07 | | Unverified