SOTAVerified

Dataset Generation

The task aims to improve the training of a target application (e.g., autonomous driving systems) by generating datasets of diverse and critical elements (e.g., traffic scenarios). Traditional methods rely on expensive and limited datasets, which often fail to capture rare but essential situations that can pose risks during testing.

Papers

Showing 101–125 of 308 papers

Title | Status | Hype
CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation | Code | 1
DiffAge3D: Diffusion-based 3D-aware Face Aging | - | 0
PADetBench: Towards Benchmarking Physical Attacks against Object Detection | Code | 1
CyberPal.AI: Empowering LLMs with Expert-Driven Cybersecurity Instructions | - | 0
Fire Dynamic Vision: Image Segmentation and Tracking for Multi-Scale Fire and Plume Behavior | Code | 0
The Fellowship of the LLMs: Multi-Agent Workflows for Synthetic Preference Optimization Dataset Generation | Code | 0
A systematic dataset generation technique applied to data-driven automotive aerodynamics | - | 0
Neural Network Surrogate and Projected Gradient Descent for Fast and Reliable Finite Element Model Calibration: a Case Study on an Intervertebral Disc | Code | 0
Full-range Head Pose Geometric Data Augmentations | - | 0
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework | Code | 3
On Deep Learning for computing the Dynamic Initial Margin and Margin Value Adjustment | - | 0
AutoVCoder: A Systematic Framework for Automated Verilog Code Generation using LLMs | - | 0
Realistic Surgical Image Dataset Generation Based On 3D Gaussian Splatting | - | 0
LSD3K: A Benchmark for Smoke Removal from Laparoscopic Surgery Images | - | 0
Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation | - | 0
Chip Placement with Diffusion Models | Code | 1
DataDream: Few-shot Guided Dataset Generation | Code | 2
JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation | - | 0
Rethinking the Effectiveness of Graph Classification Datasets in Benchmarks for Assessing GNNs | Code | 0
Synthetic data: How could it be used for infectious disease research? | - | 0
Semantically Rich Local Dataset Generation for Explainable AI in Genomics | Code | 0
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts | Code | 1
Channel Modeling Aided Dataset Generation for AI-Enabled CSI Feedback: Advances, Challenges, and Solutions | - | 0
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models | Code | 2
DataFreeShield: Defending Adversarial Attacks without Training Data | - | 0
Page 5 of 13

No leaderboard results yet.