SOTAVerified

Dataset Generation

The task is to improve the training of a target application (e.g., autonomous driving systems) by generating datasets of diverse and critical elements (e.g., traffic scenarios). Traditional methods rely on expensive, limited datasets that often fail to capture rare but essential situations that can pose risks during testing.

Papers

Showing 51-100 of 308 papers

Title | Status | Hype
DCFace: Synthetic Face Generation with Dual Condition Diffusion Model | Code | 1
CamDiff: Camouflage Image Augmentation via Diffusion Model | Code | 1
LIQUID: A Framework for List Question Answering Dataset Generation | Code | 1
ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback | Code | 1
Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics | Code | 1
RealFlow: EM-based Realistic Optical Flow Dataset Generation from Videos | Code | 1
HM3D-ABO: A Photo-realistic Dataset for Object-centric Multi-view 3D Reconstruction | Code | 1
Learning to Answer Visual Questions from Web Videos | Code | 1
ZeroGen: Efficient Zero-shot Learning via Dataset Generation | Code | 1
Detecting Anti-Vaccine Users on Twitter | Code | 1
SynPick: A Dataset for Dynamic Bin Picking Scene Understanding | Code | 1
SofaMyRoom: a fast and multiplatform "shoebox" room simulator for binaural room impulse response dataset generation | Code | 1
Improving Paraphrase Detection with the Adversarial Paraphrasing Task | Code | 1
Perceptual Loss for Robust Unsupervised Homography Estimation | Code | 1
Monocular Multi-Layer Layout Estimation for Warehouse Racks | Code | 1
MK-SQuIT: Synthesizing Questions using Iterative Template-filling | Code | 1
Actionet: An Interactive End-To-End Platform For Task-Based Data Collection And Augmentation In 3D Environment | Code | 1
Afro-MNIST: Synthetic generation of MNIST-style datasets for low-resource languages | Code | 1
Image Generation for Efficient Neural Network Training in Autonomous Drone Racing | Code | 1
ViWi Vision-Aided mmWave Beam Tracking: Dataset, Task, and Baseline Solutions | Code | 1
Communicating Smartly in the Molecular Domain: Neural Networks in the Internet of Bio-Nano Things | Code | 0
From General to Targeted Rewards: Surpassing GPT-4 in Open-Ended Long-Context Generation | - | 0
A large-scale, physically-based synthetic dataset for satellite pose estimation | - | 0
Enhancing Clinical Models with Pseudo Data for De-identification | Code | 0
Code Execution as Grounded Supervision for LLM Reasoning | Code | 0
Generating Synthetic Stereo Datasets using 3D Gaussian Splatting and Expert Knowledge Transfer | - | 0
Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training | - | 0
CETBench: A Novel Dataset constructed via Transformations over Programs for Benchmarking LLMs for Code-Equivalence Checking | - | 0
Multi-Domain ABSA Conversation Dataset Generation via LLMs for Real-World Evaluation and Model Comparison | - | 0
F-ANcGAN: An Attention-Enhanced Cycle Consistent Generative Adversarial Architecture for Synthetic Image Generation of Nanoparticles | Code | 0
Track Anything Annotate: Video annotation and dataset generation of computer vision models | Code | 0
Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets | - | 0
seg_3D_by_PC2D: Multi-View Projection for Domain Generalization and Adaptation in 3D Semantic Segmentation | Code | 0
FMSD-TTS: Few-shot Multi-Speaker Multi-Dialect Text-to-Speech Synthesis for Ü-Tsang, Amdo and Kham Speech Dataset Generation | - | 0
Automatic Dataset Generation for Knowledge Intensive Question Answering Tasks | - | 0
Event-based Star Tracking under Spacecraft Jitter: the e-STURT Dataset | - | 0
Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents | - | 0
Contextual Paralinguistic Data Creation for Multi-Modal Speech-LLM: Data Condensation and Spoken QA Generation | - | 0
LoFT: LoRA-fused Training Dataset Generation with Few-shot Guidance | Code | 0
Noisemaker 3D: Comprehensive Framework for Mesh Noise Generation | Code | 0
IrrMap: A Large-Scale Comprehensive Dataset for Irrigation Method Mapping | Code | 0
Guidance for Intra-cardiac Echocardiography Manipulation to Maintain Continuous Therapy Device Tip Visibility | - | 0
Estimating Commonsense Scene Composition on Belief Scene Graphs | - | 0
Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments | - | 0
TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models | Code | 0
V^2R-Bench: Holistically Evaluating LVLM Robustness to Fundamental Visual Variations | - | 0
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World | - | 0
Unreal Robotics Lab: A High-Fidelity Robotics Simulator with Advanced Physics and Rendering | - | 0
Geometric Generality of Transformer-Based Gröbner Basis Computation | Code | 0
DeepWheel: Generating a 3D Synthetic Wheel Dataset for Design and Performance Evaluation | - | 0

No leaderboard results yet.