Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1951–2000 of 4240 papers

Title	Date	Tasks	Status
De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts	Mar 28, 2024	Causal InferenceData-free Knowledge Distillation	—Unverified
Confidence-aware Self-Semantic Distillation on Knowledge Graph Embedding	Jun 7, 2022	Graph EmbeddingKnowledge Distillation	—Unverified
GVP: Generative Volumetric Primitives	Mar 31, 2023	Image GenerationKnowledge Distillation	—Unverified
Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding	Apr 20, 2019	Ensemble LearningKnowledge Distillation	—Unverified
Guiding Teacher Forcing with Seer Forcing for Neural Machine Translation	Jun 12, 2021	DecoderKnowledge Distillation	—Unverified
Bilateral Memory Consolidation for Continual Learning	Jan 1, 2023	Continual LearningKnowledge Distillation	—Unverified
Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation	Apr 17, 2019	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified
Improving Neural ODEs via Knowledge Distillation	Mar 10, 2022	Knowledge Distillation	—Unverified
Guided Deep Metric Learning	Jun 4, 2022	Few-Shot LearningKnowledge Distillation	—Unverified
GTCOM Neural Machine Translation Systems for WMT19	Aug 1, 2019	Knowledge DistillationLanguage Modeling	—Unverified
Decision Boundary-aware Knowledge Consolidation Generates Better Instance-Incremental Learner	Jun 5, 2024	class-incremental learningClass Incremental Learning	—Unverified
Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTS	Oct 19, 2024	Knowledge Distillation	—Unverified
Growing Deep Neural Network Considering with Similarity between Neurons	Aug 23, 2024	Decision MakingKnowledge Distillation	—Unverified
Decentralized and Model-Free Federated Learning: Consensus-Based Distillation in Function Space	Apr 1, 2021	Federated LearningKnowledge Distillation	—Unverified
Debias the Black-box: A Fair Ranking Framework via Knowledge Distillation	Aug 24, 2022	FairnessInformation Retrieval	—Unverified
Improving Route Choice Models by Incorporating Contextual Factors via Knowledge Distillation	Mar 27, 2019	Knowledge DistillationManagement	—Unverified
Always Strengthen Your Strengths: A Drift-Aware Incremental Learning Framework for CTR Prediction	Apr 17, 2023	Click-Through Rate PredictionDiversity	—Unverified
Adaptively Integrated Knowledge Distillation and Prediction Uncertainty for Continual Learning	Jan 18, 2023	Continual LearningKnowledge Distillation	—Unverified
A Closer Look at Knowledge Distillation with Features, Logits, and Gradients	Mar 18, 2022	Incremental LearningKnowledge Distillation	—Unverified
Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation	Aug 1, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified
AdvFunMatch: When Consistent Teaching Meets Adversarial Robustness	May 24, 2023	Adversarial RobustnessKnowledge Distillation	—Unverified
Group-Mix SAM: Lightweight Solution for Industrial Assembly Line Applications	Mar 15, 2024	Knowledge Distillation	—Unverified
Debiased Distillation by Transplanting the Last Layer	Feb 22, 2023	AttributeKnowledge Distillation	—Unverified
Improving the Interpretability of Deep Neural Networks with Knowledge Distillation	Dec 28, 2018	EthicsKnowledge Distillation	—Unverified
Grouped Knowledge Distillation for Deep Face Recognition	Apr 10, 2023	Face RecognitionKnowledge Distillation	—Unverified
Group Distributionally Robust Knowledge Distillation	Nov 1, 2023	Knowledge Distillation	—Unverified
Improving Video Model Transfer With Dynamic Representation Learning	Jan 1, 2022	Action ClassificationKnowledge Distillation	—Unverified
Debate, Reflect, and Distill: Multi-Agent Feedback with Tree-Structured Preference Optimization for Efficient Language Model Enhancement	Jun 4, 2025	Knowledge DistillationLanguage Modeling	—Unverified
Group channel pruning and spatial attention distilling for object detection	Jun 2, 2023	Knowledge DistillationModel Compression	—Unverified
Improving Zero-Shot Multilingual Text Generation via Iterative Distillation	Oct 1, 2022	Knowledge DistillationText Generation	—Unverified
Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels	May 20, 2025	Instruction FollowingKnowledge Distillation	—Unverified
In-Context Learning Distillation for Efficient Few-Shot Fine-Tuning	Dec 17, 2024	In-Context LearningKnowledge Distillation	—Unverified
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers	Apr 27, 2022	Knowledge Distillation	—Unverified
GripRank: Bridging the Gap between Retrieval and Generation via the Generative Knowledge Improved Passage Ranking	May 29, 2023	Answer GenerationDialogue Generation	—Unverified
Dealing with training and test segmentation mismatch: FBK@IWSLT2021	Jun 23, 2021	Action DetectionActivity Detection	—Unverified
Incremental Classifier Learning Based on PEDCC-Loss and Cosine Distance	Jun 11, 2019	Incremental LearningKnowledge Distillation	—Unverified
Bi-CryptoNets: Leveraging Different-Level Privacy for Encrypted Inference	Feb 2, 2024	Knowledge DistillationPrivacy Preserving	—Unverified
Incremental Knowledge Based Question Answering	Jan 18, 2021	Incremental LearningKnowledge Distillation	—Unverified
Incremental Learning for End-to-End Automatic Speech Recognition	May 11, 2020	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified
AKE-GNN: Effective Graph Learning with Adaptive Knowledge Exchange	Jun 10, 2021	ClassificationGraph Classification	—Unverified
Graph Representation Learning via Multi-task Knowledge Distillation	Nov 11, 2019	Graph Representation LearningKnowledge Distillation	—Unverified
Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs	Jul 27, 2023	Document ClassificationKnowledge Distillation	—Unverified
Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation	Apr 13, 2021	Knowledge DistillationTriplet	—Unverified
DDK: Distilling Domain Knowledge for Efficient Large Language Models	Jul 23, 2024	Knowledge Distillation	—Unverified
DCSNet: A Lightweight Knowledge Distillation-Based Model with Explainable AI for Lung Cancer Diagnosis from Histopathological Images	May 14, 2025	DiagnosticKnowledge Distillation	—Unverified
Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer	Oct 9, 2020	Decoderimage-classification	—Unverified
Incrementer: Transformer for Class-Incremental Semantic Segmentation With Knowledge Distillation Focusing on Old Class	Jan 1, 2023	Class-Incremental Semantic SegmentationDecoder	—Unverified
ALP-KD: Attention-Based Layer Projection for Knowledge Distillation	Dec 27, 2020	Knowledge Distillation	—Unverified
Adaptive Label Smoothing with Self-Knowledge in Natural Language Generation	Oct 22, 2022	Knowledge DistillationText Generation	—Unverified
DC-CCL: Device-Cloud Collaborative Controlled Learning for Large Vision Models	Mar 18, 2023	Knowledge Distillation	—Unverified

Show:10 25 50

← PrevPage 40 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified