| Improved Training of Wasserstein GANs | Mar 31, 2017 | Conditional Image Generation, Image Generation | Code Available | 1 |
| Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models | Apr 23, 2024 | Conversational Question Answering, Dialogue State Tracking | Code Available | 1 |
| BhasaAnuvaad: A Speech Translation Dataset for 13 Indian Languages | Nov 7, 2024 | automatic-speech-translation, Synthetic Data Generation | Code Available | 1 |
| AnthroNet: Conditional Generation of Humans via Anthropometrics | Sep 7, 2023 | 3D human pose and shape estimation, 3D Human Reconstruction | Code Available | 1 |
| MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures | Mar 20, 2025 | Synthetic Data Generation | Code Available | 1 |
| Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction | Sep 1, 2021 | Data Poisoning, Knowledge Distillation | Code Available | 1 |
| FinDiff: Diffusion Models for Financial Tabular Data Generation | Sep 4, 2023 | Fraud Detection, Synthetic Data Generation | Code Available | 1 |
| Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint | Jan 1, 2023 | Data Augmentation, Data-free Knowledge Distillation | Code Available | 1 |
| FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data | Jan 28, 2025 | Natural Language Inference, Synthetic Data Generation | Code Available | 1 |
| SYNC: A Copula based Framework for Generating Synthetic Data from Aggregated Sources | Sep 20, 2020 | Feature Engineering, Synthetic Data Generation | Code Available | 1 |
| GECTurk: Grammatical Error Correction and Detection Dataset for Turkish | Sep 20, 2023 | Articles, Decoder | Code Available | 1 |
| Synthetic Data-based Detection of Zebras in Drone Imagery | Apr 30, 2023 | Missing Labels, Pose Estimation | Code Available | 1 |
| Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction | Mar 7, 2023 | Synthetic Data Generation | Code Available | 1 |
| Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction | Jan 12, 2024 | Autonomous Driving, Prediction | Code Available | 1 |
| DeepNAG: Deep Non-Adversarial Gesture Generation | Nov 18, 2020 | Data Augmentation, Dynamic Time Warping | Code Available | 1 |
| Tabular Transformers for Modeling Multivariate Time Series | Nov 3, 2020 | Fraud Detection, Synthetic Data Generation | Code Available | 1 |
| RAGSynth: Synthetic Data for Robust and Faithful RAG Component Optimization | May 16, 2025 | RAG, Synthetic Data Generation | Code Available | 1 |
| Towards Ball Spin and Trajectory Analysis in Table Tennis Broadcast Videos via Physically Grounded Synthetic-to-Real Transfer | Apr 28, 2025 | Monocular 3D Object Localization, Sports Analytics | Code Available | 1 |
| EPIC: Effective Prompting for Imbalanced-Class Data Synthesis in Tabular Data Classification via Large Language Models | Apr 15, 2024 | In-Context Learning, Synthetic Data Generation | Code Available | 1 |
| AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data | Mar 7, 2025 | Diversity, Fairness | Code Available | 1 |
| Generalizing electrocardiogram delineation -- Training convolutional neural networks with synthetic data augmentation | Nov 25, 2021 | Data Augmentation, Rhythm | Code Available | 1 |
| DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails | Feb 7, 2025 | Reinforcement Learning (RL), Synthetic Data Generation | Code Available | 1 |
| Diffusion-based Conditional ECG Generation with Structured State Space Models | Jan 19, 2023 | State Space Models, Synthetic Data Generation | Code Available | 1 |
| EC-GAN: Low-Sample Classification using Semi-Supervised Algorithms and GANs | Dec 26, 2020 | Classification, Data Augmentation | Code Available | 1 |
| UnrealROX+: An Improved Tool for Acquiring Synthetic Data from Virtual 3D Environments | Apr 23, 2021 | Depth Estimation, object-detection | Code Available | 1 |
| Using matrix-product states for time-series machine learning | Dec 20, 2024 | Astronomy, Imputation | Code Available | 1 |
| Characterization and Greedy Learning of Gaussian Structural Causal Models under Unknown Interventions | Nov 27, 2022 | Synthetic Data Generation | Code Available | 1 |
| DFNet: Enhance Absolute Pose Regression with Direct Feature Matching | Apr 1, 2022 | Camera Pose Estimation, Camera Relocalization | Code Available | 1 |
| DP-MERF: Differentially Private Mean Embeddings with Random Features for Practical Privacy-Preserving Data Generation | Feb 26, 2020 | Privacy Preserving, Sensitivity | Code Available | 1 |
| Controllable 3D Generative Adversarial Face Model via Disentangling Shape and Appearance | Aug 30, 2022 | 3D Face Modelling, Face Model | Code Available | 1 |
| A Comprehensive Survey of Synthetic Tabular Data Generation | Apr 23, 2025 | Privacy Preserving, Survey | Code Available | 1 |
| Diffusion-HPC: Synthetic Data Generation for Human Mesh Recovery in Challenging Domains | Mar 16, 2023 | Human Mesh Recovery, Synthetic Data Generation | Code Available | 1 |
| dpart: Differentially Private Autoregressive Tabular, a General Framework for Synthetic Data Generation | Jul 12, 2022 | Synthetic Data Generation | Code Available | 1 |
| dpmm: Differentially Private Marginal Models, a Library for Synthetic Tabular Data Generation | May 31, 2025 | Synthetic Data Generation, Tabular Data Generation | Code Available | 1 |
| CorGAN: Correlation-Capturing Convolutional Generative Adversarial Networks for Generating Synthetic Healthcare Records | Jan 25, 2020 | Disease Prediction, General Classification | Code Available | 1 |
| ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval | May 27, 2025 | Image Retrieval, Retrieval | Code Available | 1 |
| CLIPPER: Compression enables long-context synthetic data generation | Feb 20, 2025 | Claim Verification, Synthetic Data Generation | Code Available | 1 |
| Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes | Jan 29, 2024 | Data Augmentation, Sound Event Localization and Detection | Code Available | 1 |
| Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement | Jan 21, 2025 | Synthetic Data Generation, World Knowledge | Code Available | 1 |
| Exploring Transformer Text Generation for Medical Dataset Augmentation | May 1, 2020 | Synthetic Data Generation, Text Generation | Code Available | 1 |
| Differentially Private Synthetic Medical Data Generation using Convolutional GANs | Dec 22, 2020 | Deep Learning, image-classification | Code Available | 1 |
| EEG Synthetic Data Generation Using Probabilistic Diffusion Models | Mar 6, 2023 | Brain Computer Interface, Data Augmentation | Code Available | 1 |
| GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes | May 25, 2023 | Computed Tomography (CT), Image Generation | Code Available | 1 |
| Copula-based synthetic data augmentation for machine-learning emulators | Dec 16, 2020 | BIG-bench Machine Learning, Data Augmentation | Code Available | 1 |
| BLEUBERI: BLEU is a surprisingly effective reward for instruction following | May 16, 2025 | Instruction Following, Synthetic Data Generation | Code Available | 1 |
| Generating Synthetic Handwritten Historical Documents With OCR Constrained GANs | Mar 15, 2021 | Optical Character Recognition (OCR), Synthetic Data Generation | Code Available | 1 |
| MEDIBENG WHISPER TINY: A FINE-TUNED CODE-SWITCHED BENGALI-ENGLISH TRANSLATOR FOR CLINICAL APPLICATIONS | Apr 25, 2025 | Clinical Language Translation, Machine Translation | Code Available | 1 |
| GeoPointGAN: Synthetic Spatial Data with Local Label Differential Privacy | May 18, 2022 | Management, Privacy Preserving | Code Available | 1 |
| Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics | Oct 18, 2022 | 3D Object Detection, 3D Reconstruction | Code Available | 1 |
| Will we run out of data? Limits of LLM scaling based on human-generated data | Oct 26, 2022 | Language Modeling, Language Modelling | Code Available | 1 |