Better Synthetic Data by Retrieving and Transforming Existing Datasets | Apr 22, 2024 | Dataset Generation; Diversity | Code Available (75)
Synthetic Dataset Generation for Adversarial Machine Learning Research | Jul 21, 2022 | BIG-bench Machine Learning; Dataset Generation | Code Available (65)
Prompt2Model: Generating Deployable Models from Natural Language Instructions | Aug 23, 2023 | Data-free Knowledge Distillation; Dataset Generation | Code Available (45)
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct | May 23, 2024 | Class-level Code Generation; Code Completion | Code Available (45)
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework | Aug 2, 2024 | Benchmarking; Dataset Generation | Code Available (35)
Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval | Jun 9, 2025 | Dataset Generation; RAG | Code Available (35)
DataDream: Few-shot Guided Dataset Generation | Jul 15, 2024 | Classification; Dataset Generation | Code Available (25)
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models | Jun 27, 2024 | Attribute; Benchmarking | Code Available (25)
Vision Language Action Models in Robotic Manipulation: A Systematic Review | Jul 14, 2025 | Dataset Generation; Natural Language Understanding | Code Available (25)
An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation | Feb 26, 2024 | Dataset Generation; text-to-speech | Code Available (25)
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models | Aug 11, 2023 | Dataset Generation; Decoder | Code Available (25)
Physics Informed Distillation for Diffusion Models | Nov 13, 2024 | Dataset Generation; Image Generation | Code Available (25)
CellViT++: Energy-Efficient and Adaptive Cell Segmentation and Classification Using Foundation Models | Jan 9, 2025 | Cell Segmentation; Dataset Generation | Code Available (25)
JAX-SPH: A Differentiable Smoothed Particle Hydrodynamics Framework | Mar 7, 2024 | Dataset Generation | Code Available (25)
MultiCorrupt: A Multi-Modal Robustness Dataset and Benchmark of LiDAR-Camera Fusion for 3D Object Detection | Feb 18, 2024 | 3D Object Detection; Dataset Generation | Code Available (25)
PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DoF Object Pose Dataset Generation | Jan 4, 2024 | Dataset Generation; Object | Code Available (15)
NeuroGraph: Benchmarks for Graph Machine Learning in Brain Connectomics | Jun 9, 2023 | Benchmarking; Dataset Generation | Code Available (15)
Perceptual Loss for Robust Unsupervised Homography Estimation | Apr 20, 2021 | Dataset Generation; Homography Estimation | Code Available (15)
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning | Jun 5, 2025 | Dataset Generation; Mathematical Problem-Solving | Code Available (15)
LLMaAA: Making Large Language Models as Active Annotators | Oct 30, 2023 | Active Learning; Dataset Generation | Code Available (15)
DCFace: Synthetic Face Generation with Dual Condition Diffusion Model | Apr 14, 2023 | Dataset Generation; Face Generation | Code Available (15)
MK-SQuIT: Synthesizing Questions using Iterative Template-filling | Nov 4, 2020 | Dataset Generation; Machine Translation | Code Available (15)
CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation | Sep 3, 2024 | Dataset Generation; Question Answering | Code Available (15)
Detecting Anti-Vaccine Users on Twitter | Oct 21, 2021 | Dataset Generation; Misinformation | Code Available (15)
DiffuGen: Adaptable Approach for Generating Labeled Image Datasets using Stable Diffusion Models | Sep 1, 2023 | Dataset Generation; Image Generation | Code Available (15)
Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering | Aug 31, 2023 | Benchmarking; Dataset Generation | Code Available (15)
Afro-MNIST: Synthetic generation of MNIST-style datasets for low-resource languages | Sep 28, 2020 | BIG-bench Machine Learning; Dataset Generation | Code Available (15)
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs | Sep 18, 2023 | Dataset Generation; Question Answering | Code Available (15)
Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis | Mar 11, 2025 | All; Dataset Generation | Code Available (15)
PADetBench: Towards Benchmarking Physical Attacks against Object Detection | Aug 17, 2024 | Adversarial Robustness; Benchmarking | Code Available (15)
Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for Pixel-Level Semantic Segmentation | Sep 25, 2023 | Dataset Generation; Segmentation | Code Available (15)
Learning-based NLOS Detection and Uncertainty Prediction of GNSS Observations with Transformer-Enhanced LSTM Network | Sep 1, 2023 | Dataset Generation; State Estimation | Code Available (15)
ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation | Dec 24, 2024 | Dataset Generation | Code Available (15)
Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks | May 23, 2023 | Attribute; Dataset Generation | Code Available (15)
Image Generation for Efficient Neural Network Training in Autonomous Drone Racing | Aug 6, 2020 | Dataset Generation; Efficient Neural Network | Code Available (15)
Learning to Answer Visual Questions from Web Videos | May 10, 2022 | Dataset Generation; Question Answering | Code Available (15)
Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design | May 29, 2024 | Dataset Generation; Image to text | Code Available (15)
Chip Placement with Diffusion Models | Jul 17, 2024 | Dataset Generation; Denoising | Code Available (15)
ColabSfM: Collaborative Structure-from-Motion by Point Cloud Registration | Mar 21, 2025 | Dataset Generation; Point Cloud Registration | Code Available (15)
Automated Multi-level Preference for MLLMs | May 18, 2024 | Dataset Generation; Hallucination | Code Available (15)
CamDiff: Camouflage Image Augmentation via Diffusion Model | Apr 11, 2023 | Dataset Generation; Image Augmentation | Code Available (15)
HM3D-ABO: A Photo-realistic Dataset for Object-centric Multi-view 3D Reconstruction | Jun 24, 2022 | 3D Reconstruction; Camera Pose Estimation | Code Available (15)
Generating Traffic Scenarios via In-Context Learning to Learn Better Motion Planner | Dec 24, 2024 | Autonomous Driving; Dataset Generation | Code Available (15)
Improving Paraphrase Detection with the Adversarial Paraphrasing Task | Jun 14, 2021 | Dataset Generation; Paraphrase Identification | Code Available (15)
Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map | May 6, 2025 | Dataset Generation; Segmentation | Code Available (15)
CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models | Jan 2, 2025 | Benchmarking; Computer Security | Code Available (15)
Actionet: An Interactive End-To-End Platform For Task-Based Data Collection And Augmentation In 3D Environment | Oct 3, 2020 | Dataset Generation; Task Planning | Code Available (15)
Forcing Diffuse Distributions out of Language Models | Apr 16, 2024 | Dataset Generation; Diversity | Code Available (15)
OpenLS-DGF: An Adaptive Open-Source Dataset Generation Framework for Machine Learning Tasks in Logic Synthesis | Nov 14, 2024 | Dataset Generation | Code Available (15)
Faithful Persona-based Conversational Dataset Generation with Large Language Models | Dec 15, 2023 | Chatbot; Dataset Generation | Code Available (15)