Synthetic Data Generation
The generation of tabular data by any means possible.
Papers
Showing 1–25 of 822 papers
Title | Date | Tasks | Status | Hype | Score
Qwen2.5-Coder Technical Report | Sep 18, 2024 | Code Generation | Code Available | 11 | 5
Better Synthetic Data by Retrieving and Transforming Existing Datasets | Apr 22, 2024 | Dataset Generation, Diversity | Code Available | 7 | 5
LAB: Large-Scale Alignment for ChatBots | Mar 2, 2024 | Instruction Following, Language Modeling | Code Available | 5 | 5
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows | Feb 16, 2024 | Synthetic Data Generation | Code Available | 5 | 5
TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models | May 18, 2023 | Natural Language Inference, Synthetic Data Generation | Code Available | 4 | 5
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation | May 26, 2025 | Question Answering, Synthetic Data Generation | Code Available | 4 | 5
Nemotron-4 340B Technical Report | Jun 17, 2024 | Synthetic Data Generation | Code Available | 4 | 5
FSID: Fully Synthetic Image Denoising via Procedural Scene Generation | Dec 7, 2022 | Denoising, Image Denoising | Code Available | 4 | 5
TabularARGN: A Flexible and Efficient Auto-Regressive Framework for Generating High-Fidelity Synthetic Data | Jan 21, 2025 | Fairness, Imputation | Code Available | 4 | 5
MegActor: Harness the Power of Raw Video for Vivid Portrait Animation | May 31, 2024 | Portrait Animation, Style Transfer | Code Available | 4 | 5
Annif at SemEval-2025 Task 5: Traditional XMTC augmented by LLMs | Apr 28, 2025 | Synthetic Data Generation | Code Available | 3 | 5
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis | Dec 27, 2024 | Diversity, Synthetic Data Generation | Code Available | 3 | 5
Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Jun 10, 2025 | 3D Lane Detection, 3D Object Detection | Code Available | 3 | 5
A Survey on Deep Learning for Theorem Proving | Apr 15, 2024 | Automated Theorem Proving, Deep Learning | Code Available | 3 | 5
ReasonIR: Training Retrievers for Reasoning Tasks | Apr 29, 2025 | Information Retrieval, MMLU | Code Available | 3 | 5
Predict, Refine, Synthesize: Self-Guiding Diffusion Models for Probabilistic Time Series Forecasting | Jul 21, 2023 | Imputation, Probabilistic Time Series Forecasting | Code Available | 2 | 5
Mellow: a small audio language model for reasoning | Mar 11, 2025 | Audio captioning, Language Modeling | Code Available | 2 | 5
Pedagogical Alignment of Large Language Models | Feb 7, 2024 | Synthetic Data Generation | Code Available | 2 | 5
REaLTabFormer: Generating Realistic Relational and Tabular Data using Transformers | Feb 4, 2023 | Synthetic Data Generation | Code Available | 2 | 5
Benchmarking Synthetic Tabular Data: A Multi-Dimensional Evaluation Framework | Apr 2, 2025 | Benchmarking, Synthetic Data Generation | Code Available | 2 | 5
BEDLAM: A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion | Jun 29, 2023 | Synthetic Data Generation | Code Available | 2 | 5
A Synthetic Dataset for Personal Attribute Inference | Jun 11, 2024 | Attribute, Author Profiling | Code Available | 2 | 5
Improved Multi-Task Brain Tumour Segmentation with Synthetic Data Augmentation | Nov 7, 2024 | Data Augmentation, Synthetic Data Generation | Code Available | 2 | 5
Improving 2D Human Pose Estimation in Rare Camera Views with Synthetic Data | Jul 13, 2023 | 2D Human Pose Estimation, Pose Estimation | Code Available | 2 | 5
InPars Toolkit: A Unified and Reproducible Synthetic Data Generation Pipeline for Neural Information Retrieval | Jul 10, 2023 | GPU, Information Retrieval | Code Available | 2 | 5
Benchmark Results
UCI Epileptic Seizure Recognition
2 submissions
↑ higher is better
# | Model | Metric | Claimed | Verified | Status
1 | corGAN | AUROC | 0.92 | — | Unverified
2 | GAN | AUROC | 0.87 | — | Unverified
UNSW-NB15
2 submissions
↑ higher is better
# | Model | Metric | Claimed | Verified | Status
1 | kiNETGAN | EMD | 0.07 | — | Unverified
2 | CTGAN | EMD | 0.07 | — | Unverified