Better Synthetic Data by Retrieving and Transforming Existing Datasets (Apr 22, 2024). Tags: Dataset Generation, Diversity. Code available (7).
Synthetic Dataset Generation for Adversarial Machine Learning Research (Jul 21, 2022). Tags: BIG-bench Machine Learning, Dataset Generation. Code available (6).
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct (May 23, 2024). Tags: Class-level Code Generation, Code Completion. Code available (4).
Prompt2Model: Generating Deployable Models from Natural Language Instructions (Aug 23, 2023). Tags: Data-free Knowledge Distillation, Dataset Generation. Code available (4).
Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval (Jun 9, 2025). Tags: Dataset Generation, RAG. Code available (3).
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework (Aug 2, 2024). Tags: Benchmarking, Dataset Generation. Code available (3).
Vision Language Action Models in Robotic Manipulation: A Systematic Review (Jul 14, 2025). Tags: Dataset Generation, Natural Language Understanding. Code available (2).
CellViT++: Energy-Efficient and Adaptive Cell Segmentation and Classification Using Foundation Models (Jan 9, 2025). Tags: Cell Segmentation, Dataset Generation. Code available (2).
Physics Informed Distillation for Diffusion Models (Nov 13, 2024). Tags: Dataset Generation, Image Generation. Code available (2).
DataDream: Few-shot Guided Dataset Generation (Jul 15, 2024). Tags: Classification, Dataset Generation. Code available (2).
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models (Jun 27, 2024). Tags: Attribute, Benchmarking. Code available (2).
JAX-SPH: A Differentiable Smoothed Particle Hydrodynamics Framework (Mar 7, 2024). Tags: Dataset Generation. Code available (2).
An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation (Feb 26, 2024). Tags: Dataset Generation, text-to-speech. Code available (2).
MultiCorrupt: A Multi-Modal Robustness Dataset and Benchmark of LiDAR-Camera Fusion for 3D Object Detection (Feb 18, 2024). Tags: 3D Object Detection, Dataset Generation. Code available (2).
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models (Aug 11, 2023). Tags: Dataset Generation, Decoder. Code available (2).
Real-Time Per-Garment Virtual Try-On with Temporal Consistency for Loose-Fitting Garments (Jun 14, 2025). Tags: Dataset Generation, Virtual Try-on. Code available (1).
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning (Jun 5, 2025). Tags: Dataset Generation, Mathematical Problem-Solving. Code available (1).
TimeGraph: Synthetic Benchmark Datasets for Robust Time-Series Causal Discovery (Jun 2, 2025). Tags: Causal Discovery, Dataset Generation. Code available (1).
Tiny QA Benchmark++: Ultra-Lightweight, Synthetic Multilingual Dataset Generation & Smoke-Tests for Continuous LLM Evaluation (May 17, 2025). Tags: Dataset Generation, GPU. Code available (1).
Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map (May 6, 2025). Tags: Dataset Generation, Segmentation. Code available (1).
ColabSfM: Collaborative Structure-from-Motion by Point Cloud Registration (Mar 21, 2025). Tags: Dataset Generation, Point Cloud Registration. Code available (1).
Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis (Mar 11, 2025). Tags: All, Dataset Generation. Code available (1).
CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models (Jan 2, 2025). Tags: Benchmarking, Computer Security. Code available (1).
ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation (Dec 24, 2024). Tags: Dataset Generation. Code available (1).
Generating Traffic Scenarios via In-Context Learning to Learn Better Motion Planner (Dec 24, 2024). Tags: Autonomous Driving, Dataset Generation. Code available (1).
SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models (Dec 3, 2024). Tags: Dataset Generation, Image-to-Image Translation. Code available (1).
Global Tensor Motion Planning (Nov 28, 2024). Tags: Dataset Generation, Diversity. Code available (1).
OpenLS-DGF: An Adaptive Open-Source Dataset Generation Framework for Machine Learning Tasks in Logic Synthesis (Nov 14, 2024). Tags: Dataset Generation. Code available (1).
CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation (Sep 3, 2024). Tags: Dataset Generation, Question Answering. Code available (1).
PADetBench: Towards Benchmarking Physical Attacks against Object Detection (Aug 17, 2024). Tags: Adversarial Robustness, Benchmarking. Code available (1).
Chip Placement with Diffusion Models (Jul 17, 2024). Tags: Dataset Generation, Denoising. Code available (1).
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts (Jul 3, 2024). Tags: Automated Theorem Proving, Code Generation. Code available (1).
Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design (May 29, 2024). Tags: Dataset Generation, Image to text. Code available (1).
Automated Multi-level Preference for MLLMs (May 18, 2024). Tags: Dataset Generation, Hallucination. Code available (1).
Forcing Diffuse Distributions out of Language Models (Apr 16, 2024). Tags: Dataset Generation, Diversity. Code available (1).
UAV-Rain1k: A Benchmark for Raindrop Removal from UAV Aerial Imagery (Feb 8, 2024). Tags: Dataset Generation, Raindrop Removal. Code available (1).
PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DoF Object Pose Dataset Generation (Jan 4, 2024). Tags: Dataset Generation, Object. Code available (1).
Generalizing Single-View 3D Shape Retrieval to Occlusions and Unseen Objects (Dec 31, 2023). Tags: 3D Shape Retrieval, Dataset Generation. Code available (1).
Faithful Persona-based Conversational Dataset Generation with Large Language Models (Dec 15, 2023). Tags: Chatbot, Dataset Generation. Code available (1).
LLMaAA: Making Large Language Models as Active Annotators (Oct 30, 2023). Tags: Active Learning, Dataset Generation. Code available (1).
Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for Pixel-Level Semantic Segmentation (Sep 25, 2023). Tags: Dataset Generation, Segmentation. Code available (1).
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs (Sep 18, 2023). Tags: Dataset Generation, Question Answering. Code available (1).
Learning-based NLOS Detection and Uncertainty Prediction of GNSS Observations with Transformer-Enhanced LSTM Network (Sep 1, 2023). Tags: Dataset Generation, State Estimation. Code available (1).
DiffuGen: Adaptable Approach for Generating Labeled Image Datasets using Stable Diffusion Models (Sep 1, 2023). Tags: Dataset Generation, Image Generation. Code available (1).
Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering (Aug 31, 2023). Tags: Benchmarking, Dataset Generation. Code available (1).
Supervised Homography Learning with Realistic Dataset Generation (Jul 28, 2023). Tags: Dataset Generation. Code available (1).
SynTable: A Synthetic Data Generation Pipeline for Unseen Object Amodal Instance Segmentation of Cluttered Tabletop Scenes (Jul 14, 2023). Tags: Amodal Instance Segmentation, Dataset Generation. Code available (1).
NeuroGraph: Benchmarks for Graph Machine Learning in Brain Connectomics (Jun 9, 2023). Tags: Benchmarking, Dataset Generation. Code available (1).
Sim-Suction: Learning a Suction Grasp Policy for Cluttered Environments Using a Synthetic Benchmark (May 25, 2023). Tags: Dataset Generation, Physical Simulations. Code available (1).
Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks (May 23, 2023). Tags: Attribute, Dataset Generation. Code available (1).