| Chain-of-Translation Prompting (CoTR): A Novel Prompting Technique for Low Resource Languages | Sep 6, 2024 | Hate Speech Detection, Sentiment Analysis | Unverified | 0 |
| Boundless: Generating Photorealistic Synthetic Data for Object Detection in Urban Streetscapes | Sep 4, 2024 | Object, object-detection | Code Available | 0 |
| Building Math Agents with Multi-Turn Iterative Preference Learning | Sep 4, 2024 | GSM8K, Math | Unverified | 0 |
| Synthetic Data Generation and Automated Multidimensional Data Labeling for AI/ML in General and Circular Coordinates | Sep 3, 2024 | Outlier Detection, Synthetic Data Generation | Unverified | 0 |
| Differentially Private Synthetic High-dimensional Tabular Stream | Aug 31, 2024 | Synthetic Data Generation | Unverified | 0 |
| Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling | Aug 29, 2024 | Diversity, Knowledge Distillation | Unverified | 0 |
| Are Pose Estimators Ready for the Open World? STAGE: Synthetic Data Generation Toolkit for Auditing 3D Human Pose Estimators | Aug 28, 2024 | Sensitivity, Synthetic Data Generation | Unverified | 0 |
| Efficient LLM Scheduling by Learning to Rank | Aug 28, 2024 | Blocking, Chatbot | Code Available | 2 |
| LowCLIP: Adapting the CLIP Model Architecture for Low-Resource Languages in Multimodal Image Retrieval Task | Aug 25, 2024 | Computational Efficiency, Image Augmentation | Code Available | 0 |
| A density ratio framework for evaluating the utility of synthetic data | Aug 23, 2024 | Density Ratio Estimation, Synthetic Data Generation | Unverified | 0 |
| Value Alignment from Unstructured Text | Aug 19, 2024 | Synthetic Data Generation | Unverified | 0 |
| Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition | Aug 17, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Faster Private Minimum Spanning Trees | Aug 13, 2024 | Synthetic Data Generation | Unverified | 0 |
| NFDI4Health workflow and service for synthetic data generation, assessment and risk management | Aug 8, 2024 | Management, Synthetic Data Generation | Unverified | 0 |
| HARMONIC: Harnessing LLMs for Tabular Data Synthesis and Privacy Protection | Aug 6, 2024 | Privacy Preserving, Synthetic Data Generation | Unverified | 0 |
| Winning Amazon KDD Cup'24 | Aug 5, 2024 | Data Augmentation, Multiple-choice | Unverified | 0 |
| Advancing Post-OCR Correction: A Comparative Study of Synthetic Data | Aug 5, 2024 | Optical Character Recognition (OCR), Synthetic Data Generation | Code Available | 0 |
| VidModEx: Interpretable and Efficient Black Box Model Extraction for High-Dimensional Spaces | Aug 4, 2024 | image-classification, Image Classification | Code Available | 0 |
| ABC Align: Large Language Model Alignment for Safety & Accuracy | Aug 1, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Algorithms for Collaborative Machine Learning under Statistical Heterogeneity | Jul 31, 2024 | Federated Learning, Synthetic Data Generation | Unverified | 0 |
| On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition | Jul 31, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Optimizing Synthetic Data for Enhanced Pancreatic Tumor Segmentation | Jul 27, 2024 | Data Augmentation, Decision Making | Code Available | 0 |
| On the Effect of Purely Synthetic Training Data for Different Automatic Speech Recognition Architectures | Jul 25, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset | Jul 25, 2024 | Head Detection, Keypoint Estimation | Code Available | 2 |
| Flexible Generation of Preference Data for Recommendation Analysis | Jul 23, 2024 | Benchmarking, Recommendation Systems | Code Available | 0 |
| Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation | Jul 22, 2024 | Synthetic Data Generation | Unverified | 0 |
| Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection | Jul 21, 2024 | Contrastive Learning, object-detection | Unverified | 0 |
| MSCT: Addressing Time-Varying Confounding with Marginal Structural Causal Transformer for Counterfactual Post-Crash Traffic Prediction | Jul 19, 2024 | counterfactual, Prediction | Unverified | 0 |
| Unsupervised and Interpretable Synthesizing for Electrical Time Series Based on Information Maximizing Generative Adversarial Nets | Jul 18, 2024 | Descriptive, Synthetic Data Generation | Unverified | 0 |
| LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation | Jul 16, 2024 | Articles, Machine Translation | Unverified | 0 |
| Monocular pose estimation of articulated surgical instruments in open surgery | Jul 16, 2024 | 6D Pose Estimation, Domain Adaptation | Unverified | 0 |
| Zero-shot Cross-Lingual Transfer for Synthetic Data Generation in Grammatical Error Detection | Jul 16, 2024 | Cross-Lingual Transfer, Grammatical Error Detection | Unverified | 0 |
| What Makes and Breaks Safety Fine-tuning? A Mechanistic Study | Jul 14, 2024 | Synthetic Data Generation | Unverified | 0 |
| Convex space learning for tabular synthetic data generation | Jul 13, 2024 | Deep Learning, imbalanced classification | Code Available | 0 |
| Synthetic Data for Discriminating Serotonergic Neurons using Convolutional Neural Networks | Jul 8, 2024 | Synthetic Data Generation | Unverified | 0 |
| When in Doubt, Cascade: Towards Building Efficient and Capable Guardrails | Jul 8, 2024 | Synthetic Data Generation | Unverified | 0 |
| Synthetic Test Data Generation Using Recurrent Neural Networks: A Position Paper | Jul 7, 2024 | Position, Synthetic Data Generation | Unverified | 0 |
| Enhancing Hallucination Detection through Perturbation-Based Synthetic Data Generation in System Responses | Jul 7, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| Diffusion Models for Tabular Data Imputation and Synthetic Data Generation | Jul 2, 2024 | Decoder, Denoising | Unverified | 0 |
| Synthetic Multimodal Question Generation | Jul 2, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs | Jun 28, 2024 | RAG, Retrieval-augmented Generation | Unverified | 0 |
| UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models | Jun 27, 2024 | Attribute, Benchmarking | Code Available | 2 |
| Effects of Using Synthetic Data on Deep Recommender Models' Performance | Jun 26, 2024 | Data Augmentation, Recommendation Systems | Unverified | 0 |
| SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery | Jun 26, 2024 | Domain Adaptation, Earth Observation | Code Available | 2 |
| TSynD: Targeted Synthetic Data Generation for Enhanced Medical Image Classification | Jun 25, 2024 | image-classification, Image Classification | Unverified | 0 |
| Principal Component Clustering for Semantic Segmentation in Synthetic Data Generation | Jun 25, 2024 | Clustering, Image Segmentation | Unverified | 0 |
| Voice Disorder Analysis: a Transformer-based Approach | Jun 20, 2024 | Data Augmentation, Diversity | Code Available | 1 |
| Advancing Retail Data Science: Comprehensive Evaluation of Synthetic Data | Jun 19, 2024 | Demand Forecasting, Synthetic Data Generation | Unverified | 0 |
| Instruction Data Generation and Unsupervised Adaptation for Speech Language Models | Jun 18, 2024 | Synthetic Data Generation, text-to-speech | Unverified | 0 |
| ChatEMG: Synthetic Data Generation to Control a Robotic Hand Orthosis for Stroke | Jun 17, 2024 | Synthetic Data Generation | Unverified | 0 |