
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity may not be fully utilized; distillation aims to recover most of the large model's performance in a compact student model that is cheaper to run.
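As a concrete sketch of the classic recipe (temperature-scaled soft targets, introduced by Hinton et al., 2015): the student minimizes a weighted sum of the ordinary cross-entropy on ground-truth labels and a KL-divergence term that pulls its softened output distribution toward the teacher's. The PyTorch snippet below is a minimal illustration, not taken from any paper listed on this page; the function name and the hyperparameters `T` and `alpha` are illustrative choices.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD loss: alpha * softened KL term + (1 - alpha) * hard-label CE."""
    # Softened distributions: dividing logits by the temperature T flattens
    # them, exposing the teacher's relative confidences over non-target classes.
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)

    # KL divergence between the softened distributions; the T**2 factor keeps
    # gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T ** 2)

    # Ordinary cross-entropy against the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```

In a typical training loop the teacher is frozen (its logits computed under `torch.no_grad()`), so only the student's parameters receive gradients from this loss.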

Papers

Showing 3951–4000 of 4240 papers

Title | Status | Hype
Robust Knowledge Distillation in Federated Learning: Counteracting Backdoor Attacks | Code | 0
Students are the Best Teacher: Exit-Ensemble Distillation with Multi-Exits | Code | 0
Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies | Code | 0
NC-NCD: Novel Class Discovery for Node Classification | Code | 0
Robust Model Compression Using Deep Hypotheses | Code | 0
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks | Code | 0
Neighborhood Commonality-aware Evolution Network for Continuous Generalized Category Discovery | Code | 0
Robust Multimodal Segmentation with Representation Regularization and Hybrid Prototype Distillation | Code | 0
Robustness and Diversity Seeking Data-Free Knowledge Distillation | Code | 0
FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering | Code | 0
Towards Disturbance-Free Visual Mobile Manipulation | Code | 0
Towards Diverse Device Heterogeneous Federated Learning via Task Arithmetic Knowledge Integration | Code | 0
Adversarial Teacher-Student Representation Learning for Domain Generalization | Code | 0
Network Pruning via Transformable Architecture Search | Code | 0
Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation | Code | 0
Differentially Private Knowledge Distillation via Synthetic Text Generation | Code | 0
Detect, Distill and Update: Learned DB Systems Facing Out of Distribution Data | Code | 0
ROD: Reception-aware Online Distillation for Sparse Graphs | Code | 0
Neural Network Pruning with Residual-Connections and Limited-Data | Code | 0
FlowDistill: Scalable Traffic Flow Prediction via Distillation from LLMs | Code | 0
Fine-Grained Knowledge Selection and Restoration for Non-Exemplar Class Incremental Learning | Code | 0
AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation | Code | 0
New Insights on Relieving Task-Recency Bias for Online Class Incremental Learning | Code | 0
Dense 2D-3D Indoor Prediction with Sound via Aligned Cross-Modal Distillation | Code | 0
ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection | Code | 0
Adversarial Moment-Matching Distillation of Large Language Models | Code | 0
Delta Distillation for Efficient Video Processing | Code | 0
AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models | Code | 0
Few Shot Network Compression via Cross Distillation | Code | 0
Sub-goal Distillation: A Method to Improve Small Language Agents | Code | 0
Few-shot Class-Incremental Semantic Segmentation via Pseudo-Labeling and Knowledge Distillation | Code | 0
RUIE: Retrieval-based Unified Information Extraction using Large Language Model | Code | 0
deepQuest-py: Large and Distilled Models for Quality Estimation | Code | 0
WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models Through Open-Vocabulary Knowledge | Code | 0
Subspace Distillation for Continual Learning | Code | 0
A Forward and Backward Compatible Framework for Few-shot Class-incremental Pill Recognition | Code | 0
"No Matter What You Do": Purifying GNN Models via Backdoor Unlearning | Code | 0
Integrating Translation Memories into Non-Autoregressive Machine Translation | Code | 0
Non-Autoregressive Neural Machine Translation | Code | 0
Uncertainty-Guided Cross Attention Ensemble Mean Teacher for Semi-supervised Medical Image Segmentation | Code | 0
CDFKD-MFS: Collaborative Data-free Knowledge Distillation via Multi-level Feature Sharing | Code | 0
Knowledge Distillation with Deep Supervision | Code | 0
Catch-Up Distillation: You Only Need to Train Once for Accelerating Sampling | Code | 0
Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay | Code | 0
FedSiKD: Clients Similarity and Knowledge Distillation: Addressing Non-i.i.d. and Constraints in Federated Learning | Code | 0
Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation | Code | 0
Exploring Vacant Classes in Label-Skewed Federated Learning | Code | 0
SaiT: Sparse Vision Transformers through Adaptive Token Pruning | Code | 0
Not Far Away, Not So Close: Sample Efficient Nearest Neighbour Data Augmentation via MiniMax | Code | 0

Page 80 of 85

Benchmark Results

Model entries name the distillation pair (T: teacher, S: student); the Verified column is empty because no claimed result has been independently verified yet.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified