Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 3651–3675 of 4240 papers

Title	Date	Tasks	Status
Joint Pre-training and Local Re-training: Transferable Representation Learning on Multi-source Knowledge Graphs	Jun 5, 2023	Entity AlignmentKnowledge Distillation	CodeCode Available
Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization	Aug 6, 2024	Knowledge DistillationLanguage Modeling	CodeCode Available
DMSSN: Distilled Mixed Spectral-Spatial Network for Hyperspectral Salient Object Detection	Mar 31, 2024	Dimensionality ReductionKnowledge Distillation	CodeCode Available
TIE-KD: Teacher-Independent and Explainable Knowledge Distillation for Monocular Depth Estimation	Feb 22, 2024	Depth EstimationKnowledge Distillation	CodeCode Available
A Knowledge Distillation Ensemble Framework for Predicting Short and Long-term Hospitalisation Outcomes from Electronic Health Records Data	Nov 18, 2020	Decision MakingICU Admission	CodeCode Available
SMOTExT: SMOTE meets Large Language Models	May 19, 2025	Cross-Modal RetrievalData Augmentation	CodeCode Available
Random Path Selection for Continual Learning	Dec 1, 2019	Continual LearningIncremental Learning	CodeCode Available
An Adaptive Random Path Selection Approach for Incremental Learning	Jun 3, 2019	Incremental LearningKnowledge Distillation	CodeCode Available
A Knowledge Distillation-Based Approach to Enhance Transparency of Classifier Models	Feb 21, 2025	Decision MakingKnowledge Distillation	CodeCode Available
Time-Series Forecasting, Knowledge Distillation, and Refinement within a Multimodal PDE Foundation Model	Sep 17, 2024	Knowledge DistillationOperator learning	CodeCode Available
Joint Answering and Explanation for Visual Commonsense Reasoning	Feb 25, 2022	Knowledge DistillationQuestion Answering	CodeCode Available
Being Strong Progressively! Enhancing Knowledge Distillation of Large Language Models through a Curriculum Learning Framework	Jun 6, 2025	Instruction FollowingKnowledge Distillation	CodeCode Available
Comparative Knowledge Distillation	Nov 3, 2023	Data AugmentationKnowledge Distillation	CodeCode Available
Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation	Mar 27, 2024	Domain AdaptationKnowledge Distillation	CodeCode Available
Distribution Aligned Semantics Adaption for Lifelong Person Re-Identification	May 30, 2024	Knowledge DistillationPerson Re-Identification	CodeCode Available
Low-Complexity Acoustic Scene Classification Using Parallel Attention-Convolution Network	Jun 12, 2024	Acoustic Scene ClassificationData Augmentation	CodeCode Available
BEBERT: Efficient and Robust Binary Ensemble BERT	Oct 28, 2022	BinarizationComputational Efficiency	CodeCode Available
Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System	Sep 19, 2018	Knowledge DistillationLearning-To-Rank	CodeCode Available
Beyond Answers: Transferring Reasoning Capabilities to Smaller LLMs Using Multi-Teacher Knowledge Distillation	Feb 7, 2024	DiversityKnowledge Distillation	CodeCode Available
Invariant debiasing learning for recommendation via biased imputation	Dec 28, 2024	ImputationKnowledge Distillation	CodeCode Available
Adaptive Teaching with Shared Classifier for Knowledge Distillation	Jun 12, 2024	Knowledge Distillation	CodeCode Available
Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation	Nov 29, 2019	Knowledge Distillationreinforcement-learning	CodeCode Available
Compact Trilinear Interaction for Visual Question Answering	Sep 26, 2019	BenchmarkingKnowledge Distillation	CodeCode Available
CoMoTo: Unpaired Cross-Modal Lesion Distillation Improves Breast Lesion Detection in Tomosynthesis	Jul 24, 2024	Knowledge DistillationLesion Detection	CodeCode Available
Adaptive Search-and-Training for Robust and Efficient Network Pruning	Feb 1, 2023	Knowledge DistillationNetwork Pruning	CodeCode Available

Show:10 25 50

← PrevPage 147 of 170Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified