SOTAVerified

Dataset Generation

Dataset generation aims to improve the training of a target application (e.g., autonomous driving systems) by generating datasets of diverse and critical elements (e.g., traffic scenarios). Traditional methods rely on expensive, limited datasets that often fail to capture rare but essential situations that can pose risks during testing.

Papers

Showing 1-50 of 308 papers

Title | Status | Hype
Vision Language Action Models in Robotic Manipulation: A Systematic Review | Code | 2
Communicating Smartly in the Molecular Domain: Neural Networks in the Internet of Bio-Nano Things | Code | 0
From General to Targeted Rewards: Surpassing GPT-4 in Open-Ended Long-Context Generation | - | 0
A large-scale, physically-based synthetic dataset for satellite pose estimation | - | 0
Enhancing Clinical Models with Pseudo Data for De-identification | Code | 0
Real-Time Per-Garment Virtual Try-On with Temporal Consistency for Loose-Fitting Garments | Code | 1
Code Execution as Grounded Supervision for LLM Reasoning | Code | 0
Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval | Code | 3
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning | Code | 1
Generating Synthetic Stereo Datasets using 3D Gaussian Splatting and Expert Knowledge Transfer | - | 0
Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training | - | 0
CETBench: A Novel Dataset constructed via Transformations over Programs for Benchmarking LLMs for Code-Equivalence Checking | - | 0
TimeGraph: Synthetic Benchmark Datasets for Robust Time-Series Causal Discovery | Code | 1
Multi-Domain ABSA Conversation Dataset Generation via LLMs for Real-World Evaluation and Model Comparison | - | 0
F-ANcGAN: An Attention-Enhanced Cycle Consistent Generative Adversarial Architecture for Synthetic Image Generation of Nanoparticles | Code | 0
Track Anything Annotate: Video annotation and dataset generation of computer vision models | Code | 0
Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets | - | 0
seg_3D_by_PC2D: Multi-View Projection for Domain Generalization and Adaptation in 3D Semantic Segmentation | Code | 0
Automatic Dataset Generation for Knowledge Intensive Question Answering Tasks | - | 0
FMSD-TTS: Few-shot Multi-Speaker Multi-Dialect Text-to-Speech Synthesis for Ü-Tsang, Amdo and Kham Speech Dataset Generation | - | 0
Event-based Star Tracking under Spacecraft Jitter: the e-STURT Dataset | - | 0
Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents | - | 0
Contextual Paralinguistic Data Creation for Multi-Modal Speech-LLM: Data Condensation and Spoken QA Generation | - | 0
Tiny QA Benchmark++: Ultra-Lightweight, Synthetic Multilingual Dataset Generation & Smoke-Tests for Continuous LLM Evaluation | Code | 1
Noisemaker 3D: Comprehensive Framework for Mesh Noise Generation | Code | 0
LoFT: LoRA-fused Training Dataset Generation with Few-shot Guidance | Code | 0
IrrMap: A Large-Scale Comprehensive Dataset for Irrigation Method Mapping | Code | 0
Guidance for Intra-cardiac Echocardiography Manipulation to Maintain Continuous Therapy Device Tip Visibility | - | 0
Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map | Code | 1
Estimating Commonsense Scene Composition on Belief Scene Graphs | - | 0
Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments | - | 0
TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models | Code | 0
V^2R-Bench: Holistically Evaluating LVLM Robustness to Fundamental Visual Variations | - | 0
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World | - | 0
Unreal Robotics Lab: A High-Fidelity Robotics Simulator with Advanced Physics and Rendering | - | 0
Geometric Generality of Transformer-Based Gröbner Basis Computation | Code | 0
DeepWheel: Generating a 3D Synthetic Wheel Dataset for Design and Performance Evaluation | - | 0
Reasoning Inconsistencies and How to Mitigate Them in Deep Learning | - | 0
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation | - | 0
ColabSfM: Collaborative Structure-from-Motion by Point Cloud Registration | Code | 1
Neuro-Symbolic Scene Graph Conditioning for Synthetic Image Dataset Generation | - | 0
Automating 3D Dataset Generation with Neural Radiance Fields | Code | 0
Transport-Related Surface Detection with Machine Learning: Analyzing Temporal Trends in Madrid and Vienna | Code | 0
Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA | - | 0
Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis | Code | 1
Dynamic-KGQA: A Scalable Framework for Generating Adaptive Question Answering Datasets | - | 0
ValuePilot: A Two-Phase Framework for Value-Driven Decision-Making | - | 0
Technical report of a DMD-based Characterization Method for Vision Sensors | - | 0
Undertrained Image Reconstruction for Realistic Degradation in Blind Image Super-Resolution | - | 0
WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval | - | 0

No leaderboard results yet.