SOTAVerified

Dataset Generation

The task is to improve the training of a target application (e.g., autonomous driving systems) by generating datasets of diverse and critical elements (e.g., traffic scenarios). Traditional methods rely on expensive, limited datasets that often fail to capture rare but essential situations, and those missed situations can pose risks during testing.

Papers

Showing 76-100 of 308 papers

Title | Status | Hype
VariFace: Fair and Diverse Synthetic Dataset Generation for Face Recognition | - | 0
JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM | Code | 0
SynFinTabs: A Dataset of Synthetic Financial Tables for Information and Table Extraction | Code | 0
An Evolutionary Large Language Model for Hallucination Mitigation | - | 0
SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models | Code | 1
Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems | - | 0
Global Tensor Motion Planning | Code | 1
OpenLS-DGF: An Adaptive Open-Source Dataset Generation Framework for Machine Learning Tasks in Logic Synthesis | Code | 1
HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere | - | 0
Drone Detection using Deep Neural Networks Trained on Pure Synthetic Data | Code | 0
Physics Informed Distillation for Diffusion Models | Code | 2
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs | - | 0
Fineweb-Edu-Ar: Machine-translated Corpus to Support Arabic Small Language Models | - | 0
Fairness-Utilization Trade-off in Wireless Networks with Explainable Kolmogorov-Arnold Networks | - | 0
Simulating User Agents for Embodied Conversational-AI | - | 0
SYNOSIS: Image synthesis pipeline for machine vision in metal surface inspection | - | 0
FTSmartAudit: A Knowledge Distillation-Enhanced Framework for Automated Smart Contract Auditing Using Fine-Tuned LLMs | - | 0
Pseudo Dataset Generation for Out-of-Domain Multi-Camera View Recommendation | - | 0
Anchored Alignment for Self-Explanations Enhancement | - | 0
Autonomous Self-Trained Channel State Prediction Method for mmWave Vehicular Communications | - | 0
HealthQ: Unveiling Questioning Capabilities of LLM Chains in Healthcare Conversations | - | 0
EarthquakeNPP: Benchmark Datasets for Earthquake Forecasting with Neural Point Processes | - | 0
Towards Synthetic Data Generation for Improved Pain Recognition in Videos under Patient Constraints | Code | 0
Harnessing LLMs for API Interactions: A Framework for Classification and Synthetic Data Generation | - | 0
Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors | - | 0

No leaderboard results yet.