
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, that capacity is often not fully utilized, so a well-trained student model can frequently recover much of the teacher's accuracy at a fraction of the inference cost.
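
For context, the classic formulation (Hinton et al., 2015) trains the student to match the teacher's temperature-softened output distribution in addition to the usual hard-label loss. The snippet below is a minimal sketch of that loss in PyTorch; the temperature T and mixing weight alpha are illustrative hyperparameters, not values taken from any of the papers or benchmarks listed on this page.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Soft-target distillation loss: blend hard-label cross-entropy with a
    KL term between temperature-softened teacher and student outputs."""
    # Soften both distributions with temperature T before comparing them.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    # The T^2 factor keeps the soft-target gradients comparable in scale
    # to the hard-label term as the temperature changes.
    kd_term = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```

In practice the teacher's logits are computed with gradients disabled (e.g. under torch.no_grad()), and only the student's parameters are updated.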

Papers

Showing 551–575 of 4240 papers

Title | Status | Hype
Class Attention Transfer Based Knowledge Distillation | Code | 1
Class-Balanced Distillation for Long-Tailed Visual Recognition | Code | 1
Categorical Relation-Preserving Contrastive Knowledge Distillation for Medical Image Classification | Code | 1
DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer | Code | 1
Anti-Distillation Backdoor Attacks: Backdoors Can Really Survive in Knowledge Distillation | Code | 1
Defocus Blur Detection via Depth Distillation | Code | 1
Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation | Code | 1
Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs | Code | 1
Tracking-by-Trackers with a Distilled and Reinforced Model | Code | 1
CCL: Continual Contrastive Learning for LiDAR Place Recognition | Code | 1
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents | Code | 1
DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation | Code | 1
Cumulative Spatial Knowledge Distillation for Vision Transformers | Code | 1
DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation | Code | 1
CEKD: Cross-Modal Edge-Privileged Knowledge Distillation for Semantic Scene Understanding Using Only Thermal Images | Code | 1
CEN-HDR: Computationally Efficient neural Network for real-time High Dynamic Range imaging | Code | 1
A Dual-Space Framework for General Knowledge Distillation of Large Language Models | Code | 1
Directed Acyclic Transformer for Non-Autoregressive Machine Translation | Code | 1
DistilCSE: Effective Knowledge Distillation For Contrastive Sentence Embeddings | Code | 1
Channel-Aware Distillation Transformer for Depth Estimation on Nano Drones | Code | 1
DARTS: Double Attention Reference-based Transformer for Super-resolution | Code | 1
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Code | 1
Channel Gating Neural Networks | Code | 1
Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking | Code | 1
Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection | Code | 1
Page 23 of 170

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified