SOTAVerified
Synthetic Data Generation
The generation of tabular data by any means possible.
Papers
Showing 51–75 of 822 papers
Title | Date | Tasks | Status | Hype
MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures | Mar 20, 2025 | Synthetic Data Generation | Code Available | 1
AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data | Mar 7, 2025 | Diversity, Fairness | Code Available | 1
CLIPPER: Compression enables long-context synthetic data generation | Feb 20, 2025 | Claim Verification, Synthetic Data Generation | Code Available | 1
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails | Feb 7, 2025 | Reinforcement Learning (RL), Synthetic Data Generation | Code Available | 1
SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset | Feb 4, 2025 | 3D Object Detection, Autonomous Driving | Code Available | 1
XRF V2: A Dataset for Action Summarization with Wi-Fi Signals, and IMUs in Phones, Watches, Earbuds, and Glasses | Jan 31, 2025 | Action Localization, Action Recognition | Code Available | 1
FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data | Jan 28, 2025 | Natural Language Inference, Synthetic Data Generation | Code Available | 1
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement | Jan 21, 2025 | Synthetic Data Generation, World Knowledge | Code Available | 1
Synthetic Data Generation by Supervised Neural Gas Network for Physiological Emotion Recognition Data | Jan 19, 2025 | EEG, Emotion Recognition | Code Available | 1
Generating Traffic Scenarios via In-Context Learning to Learn Better Motion Planner | Dec 24, 2024 | Autonomous Driving, Dataset Generation | Code Available | 1
Using matrix-product states for time-series machine learning | Dec 20, 2024 | Astronomy, Imputation | Code Available | 1
ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis | Dec 19, 2024 | Data Augmentation, Synthetic Data Generation | Code Available | 1
SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models | Dec 3, 2024 | Dataset Generation, Image-to-Image Translation | Code Available | 1
Seed-Free Synthetic Data Generation Framework for Instruction-Tuning LLMs: A Case Study in Thai | Nov 23, 2024 | Diversity, Question Answering | Code Available | 1
BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages | Nov 7, 2024 | automatic-speech-translation, Synthetic Data Generation | Code Available | 1
SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification | Oct 7, 2024 | image-classification, Image Classification | Code Available | 1
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis | Oct 3, 2024 | HumanEval, Synthetic Data Generation | Code Available | 1
Voice Disorder Analysis: a Transformer-based Approach | Jun 20, 2024 | Data Augmentation, Diversity | Code Available | 1
SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation | May 16, 2024 | Bias Detection, Diversity | Code Available | 1
Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models | Apr 23, 2024 | Conversational Question Answering, Dialogue State Tracking | Code Available | 1
EPIC: Effective Prompting for Imbalanced-Class Data Synthesis in Tabular Data Classification via Large Language Models | Apr 15, 2024 | In-Context Learning, Synthetic Data Generation | Code Available | 1
An evaluation framework for synthetic data generation models | Apr 13, 2024 | Data Augmentation, Synthetic Data Generation | Code Available | 1
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs | Feb 23, 2024 | Benchmarking, slot-filling | Code Available | 1
Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes | Jan 29, 2024 | Data Augmentation, Sound Event Localization and Detection | Code Available | 1
Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction | Jan 12, 2024 | Autonomous Driving, Prediction | Code Available | 1
Benchmark Results
UCI Epileptic Seizure Recognition (2 submissions, ↑ higher is better)
# | Model | Metric | Claimed | Verified | Status
1 | corGAN | AUROC | 0.92 | - | Unverified
2 | GAN | AUROC | 0.87 | - | Unverified
UNSW-NB15 (2 submissions, ↓ lower is better)
# | Model | Metric | Claimed | Verified | Status
1 | kiNETGAN | EMD | 0.07 | - | Unverified
2 | CTGAN | EMD | 0.07 | - | Unverified