
Dataset Distillation

Dataset distillation is the task of synthesizing a small dataset such that models trained on it achieve high performance on the original large dataset. A dataset distillation algorithm takes as input a large real dataset (the training set) and outputs a small synthetic distilled dataset, which is evaluated by training models on the distilled dataset and testing them on a separate real dataset (the validation/test set). A good small distilled dataset is not only useful for understanding the original data, but also has various applications (e.g., continual learning, privacy, and neural architecture search).
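The description above implies a two-stage protocol: an outer distillation stage that optimizes the synthetic examples, and an evaluation stage that trains a fresh model on them and measures accuracy on real held-out data. Below is a minimal PyTorch sketch of the evaluation stage only; the linear model, input shapes, hyperparameters, and names such as `evaluate_distilled` are illustrative assumptions for this page, not the method of any particular paper listed here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative distilled set: 10 synthetic images per class for 10 classes,
# CIFAR-10-sized inputs. In an actual distillation method, x_syn would be
# the learnable output of the outer optimization (hence requires_grad=True).
num_classes, ipc = 10, 10
x_syn = torch.randn(num_classes * ipc, 3, 32, 32, requires_grad=True)
y_syn = torch.arange(num_classes).repeat_interleave(ipc)

def evaluate_distilled(x_syn, y_syn, test_loader, epochs=50):
    """Train a fresh model on the distilled data, then test on real data."""
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(epochs):
        opt.zero_grad()
        # Detach: the evaluation stage treats the synthetic data as fixed.
        F.cross_entropy(model(x_syn.detach()), y_syn).backward()
        opt.step()
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in test_loader:  # real validation/test set
            correct += (model(x).argmax(1) == y).sum().item()
            total += y.numel()
    return correct / total
```

Because the trained model's test accuracy depends on random initialization, reported evaluations typically average this procedure over several freshly initialized models.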

Papers

Showing 191–200 of 216 papers (page 20 of 22)

Title | Status | Hype
----- | ------ | ----
Does Training with Synthetic Data Truly Protect Privacy? | Code | 0
Risk of Text Backdoor Attacks Under Dataset Distillation | Code | 0
Diversity-Driven Synthesis: Enhancing Dataset Distillation through Directed Weight Adjustment | Code | 0
Towards Adversarially Robust Dataset Distillation by Curvature Regularization | Code | 0
Neural Spectral Decomposition for Dataset Distillation | Code | 0
Distributional Dataset Distillation with Subtask Decomposition | Code | 0
Going Beyond Feature Similarity: Effective Dataset Distillation based on Class-Aware Conditional Mutual Information | Code | 0
Distill the Best, Ignore the Rest: Improving Dataset Distillation with Loss-Value-Based Pruning | Code | 0
UniDetox: Universal Detoxification of Large Language Models via Dataset Distillation | Code | 0
Dataset Distillation via Adversarial Prediction Matching | Code | 0

Leaderboard

No leaderboard results yet.