SOTAVerified

Dataset Distillation

Dataset distillation is the task of synthesizing a small dataset such that models trained on it achieve high performance on the original large dataset. A dataset distillation algorithm takes a large real dataset (the training set) as input and outputs a small synthetic distilled dataset. The distilled dataset is evaluated by training models on it and testing them on a separate real dataset (the validation or test set). A good distilled dataset is useful not only for understanding the original data but also in applications such as continual learning, privacy, and neural architecture search.
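The evaluation protocol described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy 2-D data, the helper names (`make_blobs`, `distill_class_means`, `fit_1nn`, `accuracy`), and above all the "distillation" step, which simply keeps one class-mean point per class. That step is a trivial stand-in, not any of the methods from the papers listed on this page. The point is only the shape of the pipeline: distill the training set, train on the distilled set, test on separate real data.

```python
import random

def make_blobs(n_per_class, centers, spread, rng):
    """Toy 'real' dataset: 2-D Gaussian points around each class center."""
    data = []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            point = (rng.gauss(cx, spread), rng.gauss(cy, spread))
            data.append((point, label))
    return data

def distill_class_means(train_set):
    """Stand-in distillation (assumption): one synthetic point per class, the class mean."""
    sums = {}
    for (x, y), label in train_set:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return [((sx / n, sy / n), label) for label, (sx, sy, n) in sorted(sums.items())]

def fit_1nn(dataset):
    """'Train' a 1-nearest-neighbour classifier by memorising the given dataset."""
    def predict(point):
        px, py = point
        nearest = min(dataset, key=lambda item: (item[0][0] - px) ** 2 + (item[0][1] - py) ** 2)
        return nearest[1]
    return predict

def accuracy(model, test_set):
    """Fraction of test points whose predicted label matches the true label."""
    return sum(model(point) == label for point, label in test_set) / len(test_set)

rng = random.Random(0)
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
train_set = make_blobs(200, centers, 1.0, rng)  # large real training set
test_set = make_blobs(100, centers, 1.0, rng)   # separate real test set

distilled = distill_class_means(train_set)      # small synthetic dataset
model = fit_1nn(distilled)                      # train only on the distilled data
distilled_acc = accuracy(model, test_set)       # evaluate on real held-out data

print(len(train_set), "->", len(distilled), "samples; test accuracy:", round(distilled_acc, 3))
```

In this sketch 600 training points are reduced to 3 synthetic ones, and the reported number is the accuracy on held-out real data of a model that only ever saw the distilled set. This mirrors how distilled datasets are scored in the task description, although real methods optimize the synthetic samples rather than averaging.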

Papers

Showing 101–150 of 216 papers

Title | Status | Hype
Does Training with Synthetic Data Truly Protect Privacy? | Code | 0
AST: Effective Dataset Distillation through Alignment with Smooth and High-Quality Expert Trajectories | Code | 0
Enhancing Dataset Distillation via Label Inconsistency Elimination and Learning Pattern Refinement | Code | 0
Enhancing Dataset Distillation via Non-Critical Region Refinement | Code | 0
Exploring Generalized Gait Recognition: Reducing Redundancy and Noise within Indoor and Outdoor Datasets | Code | 0
Exploring Multilingual Text Data Distillation | Code | 0
Exploring the Impact of Dataset Bias on Dataset Distillation | Code | 0
Federated Virtual Learning on Heterogeneous Data with Local-global Distillation | Code | 0
Few-Shot Dataset Distillation via Translative Pre-Training | Code | 0
Generative Dataset Distillation: Balancing Global Structure and Local Details | Code | 0
Going Beyond Feature Similarity: Effective Dataset Distillation based on Class-Aware Conditional Mutual Information | Code | 0
Hyperbolic Dataset Distillation | Code | 0
Image Distillation for Safe Data Sharing in Histopathology | Code | 0
Importance-Aware Adaptive Dataset Distillation | Code | 0
MIM4DD: Mutual Information Maximization for Dataset Distillation | Code | 0
Neural Spectral Decomposition for Dataset Distillation | Code | 0
On Divergence Measures for Bayesian Pseudocoresets | Code | 0
Rethinking Data Distillation: Do Not Overlook Calibration | Code | 0
Risk of Text Backdoor Attacks Under Dataset Distillation | Code | 0
Sequential Subset Matching for Dataset Distillation | Code | 0
Task-Specific Generative Dataset Distillation with Difficulty-Guided Sampling | Code | 0
TD3: Tucker Decomposition Based Dataset Distillation Method for Sequential Recommendation | Code | 0
Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching | Code | 0
Towards Adversarially Robust Dataset Distillation by Curvature Regularization | Code | 0
Towards Mitigating Architecture Overfitting on Distilled Datasets | Code | 0
UniDetox: Universal Detoxification of Large Language Models via Dataset Distillation | Code | 0
Dataset Meta-Learning from Kernel-Ridge Regression | | 0
The Curse of Unrolling: Rate of Differentiating Through Optimization | | 0
Dataset Distillation with Probabilistic Latent Features | | 0
Generative Dataset Distillation Based on Self-knowledge Distillation | | 0
Generative Dataset Distillation using Min-Max Diffusion Model | | 0
Dataset Distillation via the Wasserstein Metric | | 0
The Evolution of Dataset Distillation: Toward Scalable and Generalizable Solutions | | 0
Dataset Distillation Meets Provable Subset Selection | | 0
Heavy Labels Out! Dataset Distillation with Label Space Lightening | | 0
Hierarchical Features Matter: A Deep Exploration of GAN Priors for Improved Dataset Distillation | | 0
Hierarchical Features Matter: A Deep Exploration of Progressive Parameterization Method for Dataset Distillation | | 0
Video Dataset Condensation with Diffusion Models | | 0
Image Dataset Compression Based on Matrix Product States | | 0
Towards Efficient Deep Hashing Retrieval: Condensing Your Data via Feature-Embedding Matching | | 0
Beyond Modality Collapse: Representations Blending for Multimodal Dataset Distillation | | 0
Dataset Distillation in Medical Imaging: A Feasibility Study | | 0
Dataset Distillation in Latent Space | | 0
Information-Guided Diffusion Sampling for Dataset Distillation | | 0
Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions | | 0
Knowledge Hierarchy Guided Biological-Medical Dataset Distillation for Domain LLM Training | | 0
Label-Augmented Dataset Distillation | | 0
Dataset Distillation from First Principles: Integrating Core Information Extraction and Purposeful Learning | | 0
Latent Dataset Distillation with Diffusion Models | | 0
Latent Video Dataset Distillation | | 0

No leaderboard results yet.