SOTAVerified

Dataset Generation

The task involves improving the training of a target application (e.g., autonomous driving systems) by generating datasets of diverse and critical elements (e.g., traffic scenarios). Traditional methods rely on expensive and limited datasets, which often fail to capture rare but essential situations that can pose risks during testing.

Papers

Showing 101-150 of 308 papers

Title | Status | Hype
Reasoning Inconsistencies and How to Mitigate Them in Deep Learning | - | 0
Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation | - | 0
Neuro-Symbolic Scene Graph Conditioning for Synthetic Image Dataset Generation | - | 0
Automating 3D Dataset Generation with Neural Radiance Fields | Code | 0
Transport-Related Surface Detection with Machine Learning: Analyzing Temporal Trends in Madrid and Vienna | Code | 0
Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA | - | 0
ValuePilot: A Two-Phase Framework for Value-Driven Decision-Making | - | 0
Dynamic-KGQA: A Scalable Framework for Generating Adaptive Question Answering Datasets | - | 0
Undertrained Image Reconstruction for Realistic Degradation in Blind Image Super-Resolution | - | 0
Technical report of a DMD-based Characterization Method for Vision Sensors | - | 0
WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval | - | 0
Holistic Audit Dataset Generation for LLM Unlearning via Knowledge Graph Traversal and Redundancy Removal | - | 0
SpecDM: Hyperspectral Dataset Synthesis with Pixel-level Semantic Annotations | - | 0
Beyond Translation: LLM-Based Data Generation for Multilingual Fact-Checking | Code | 0
Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios | Code | 0
TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation | Code | 0
One-Shot Federated Learning with Classifier-Free Diffusion Models | - | 0
Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning | - | 0
MultiFloodSynth: Multi-Annotated Flood Synthetic Dataset Generation | - | 0
Synthetic User Behavior Sequence Generation with Large Language Models for Smart Homes | - | 0
iTRI-QA: a Toolset for Customized Question-Answer Dataset Generation Using Language Models for Enhanced Scientific Research | - | 0
E-Gen: Leveraging E-Graphs to Improve Continuous Representations of Symbolic Expressions | Code | 0
Measuring and Mitigating Hallucinations in Vision-Language Dataset Generation for Remote Sensing | - | 0
A Dataset Generation Toolbox for Dynamic Security Assessment: On the Role of the Security Boundary | Code | 0
The Invisible Hand: Unveiling Provider Bias in Large Language Models for Code Generation | - | 0
Neural Error Covariance Estimation for Precise LiDAR Localization | - | 0
DynScene: Scalable Generation of Dynamic Robotic Manipulation Scenes for Embodied AI | - | 0
Low-Biased General Annotated Dataset Generation | - | 0
Movie2Story: A framework for understanding videos and telling stories in the form of novel text | - | 0
Cognition Chain for Explainable Psychological Stress Detection on Social Media | Code | 0
SciFaultyQA: Benchmarking LLMs on Faulty Science Question Detection with a GAN-Inspired Approach to Synthetic Dataset Generation | Code | 0
Unbiased General Annotated Dataset Generation | - | 0
JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM | Code | 0
VariFace: Fair and Diverse Synthetic Dataset Generation for Face Recognition | - | 0
SynFinTabs: A Dataset of Synthetic Financial Tables for Information and Table Extraction | Code | 0
An Evolutionary Large Language Model for Hallucination Mitigation | - | 0
Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems | - | 0
Drone Detection using Deep Neural Networks Trained on Pure Synthetic Data | Code | 0
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs | - | 0
HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere | - | 0
Fineweb-Edu-Ar: Machine-translated Corpus to Support Arabic Small Language Models | - | 0
Fairness-Utilization Trade-off in Wireless Networks with Explainable Kolmogorov-Arnold Networks | - | 0
Simulating User Agents for Embodied Conversational-AI | - | 0
SYNOSIS: Image synthesis pipeline for machine vision in metal surface inspection | - | 0
Pseudo Dataset Generation for Out-of-Domain Multi-Camera View Recommendation | - | 0
FTSmartAudit: A Knowledge Distillation-Enhanced Framework for Automated Smart Contract Auditing Using Fine-Tuned LLMs | - | 0
Anchored Alignment for Self-Explanations Enhancement | - | 0
Autonomous Self-Trained Channel State Prediction Method for mmWave Vehicular Communications | - | 0
HealthQ: Unveiling Questioning Capabilities of LLM Chains in Healthcare Conversations | - | 0
EarthquakeNPP: Benchmark Datasets for Earthquake Forecasting with Neural Point Processes | - | 0
Page 3 of 7

No leaderboard results yet.