SOTAVerified

Dataset Generation

The task aims to improve the training of a target application (e.g., autonomous driving systems) by generating datasets of diverse and critical elements (e.g., traffic scenarios). Traditional methods rely on expensive, limited datasets that often fail to capture rare but essential situations that can pose risks during testing.

Papers

Showing 101–150 of 308 papers

| Title | Status | Hype |
| --- | --- | --- |
| CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation | Code | 1 |
| DiffAge3D: Diffusion-based 3D-aware Face Aging | | 0 |
| PADetBench: Towards Benchmarking Physical Attacks against Object Detection | Code | 1 |
| CyberPal.AI: Empowering LLMs with Expert-Driven Cybersecurity Instructions | | 0 |
| Fire Dynamic Vision: Image Segmentation and Tracking for Multi-Scale Fire and Plume Behavior | Code | 0 |
| The Fellowship of the LLMs: Multi-Agent Workflows for Synthetic Preference Optimization Dataset Generation | Code | 0 |
| A systematic dataset generation technique applied to data-driven automotive aerodynamics | | 0 |
| Neural Network Surrogate and Projected Gradient Descent for Fast and Reliable Finite Element Model Calibration: a Case Study on an Intervertebral Disc | Code | 0 |
| Full-range Head Pose Geometric Data Augmentations | | 0 |
| RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework | Code | 3 |
| On Deep Learning for computing the Dynamic Initial Margin and Margin Value Adjustment | | 0 |
| AutoVCoder: A Systematic Framework for Automated Verilog Code Generation using LLMs | | 0 |
| Realistic Surgical Image Dataset Generation Based On 3D Gaussian Splatting | | 0 |
| LSD3K: A Benchmark for Smoke Removal from Laparoscopic Surgery Images | | 0 |
| Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation | | 0 |
| Chip Placement with Diffusion Models | Code | 1 |
| DataDream: Few-shot Guided Dataset Generation | Code | 2 |
| JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation | | 0 |
| Rethinking the Effectiveness of Graph Classification Datasets in Benchmarks for Assessing GNNs | Code | 0 |
| Synthetic data: How could it be used for infectious disease research? | | 0 |
| Semantically Rich Local Dataset Generation for Explainable AI in Genomics | Code | 0 |
| TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts | Code | 1 |
| Channel Modeling Aided Dataset Generation for AI-Enabled CSI Feedback: Advances, Challenges, and Solutions | | 0 |
| UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models | Code | 2 |
| DataFreeShield: Defending Adversarial Attacks without Training Data | | 0 |
| SEC-QA: A Systematic Evaluation Corpus for Financial QA | | 0 |
| GECOBench: A Gender-Controlled Text Dataset and Benchmark for Quantifying Biases in Explanations | Code | 0 |
| Prompts as Auto-Optimized Training Hyperparameters: Training Best-in-Class IR Models from Scratch with 10 Gold Labels | | 0 |
| Federated Learning-based Collaborative Wideband Spectrum Sensing and Scheduling for UAVs in UTM Systems | | 0 |
| Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design | Code | 1 |
| Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets | | 0 |
| AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct | Code | 4 |
| Automated Multi-level Preference for MLLMs | Code | 1 |
| SynDy: Synthetic Dynamic Dataset Generation Framework for Misinformation Tasks | | 0 |
| BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation | | 0 |
| Stable Diffusion Dataset Generation for Downstream Classification Tasks | | 0 |
| UniGen: Universal Domain Generalization for Sentiment Classification via Zero-shot Dataset Generation | Code | 0 |
| Generative Dataset Distillation: Balancing Global Structure and Local Details | Code | 0 |
| Better Synthetic Data by Retrieving and Transforming Existing Datasets | Code | 7 |
| Forcing Diffuse Distributions out of Language Models | Code | 1 |
| Learning Wireless Data Knowledge Graph for Green Intelligent Communications: Methodology and Experiments | | 0 |
| Multi-fingered Robotic Hand Grasping in Cluttered Environments through Hand-object Contact Semantic Mapping | | 0 |
| Generating Synthetic Ground Truth Distributions for Multi-step Trajectory Prediction using Probabilistic Composite Bézier Curves | | 0 |
| Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning | | 0 |
| JAX-SPH: A Differentiable Smoothed Particle Hydrodynamics Framework | Code | 2 |
| Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs | | 0 |
| Block and Detail: Scaffolding Sketch-to-Image Generation | | 0 |
| NToP: NeRF-Powered Large-scale Dataset Generation for 2D and 3D Human Pose Estimation in Top-View Fisheye Images | Code | 0 |
| An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation | Code | 2 |
| Clustering in Dynamic Environments: A Framework for Benchmark Dataset Generation With Heterogeneous Changes | Code | 0 |

No leaderboard results yet.