Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
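As a rough illustration of how such a transfer is often set up, the sketch below computes a soft-target loss between a teacher and a student. It assumes the common temperature-softened formulation (KL divergence between softened output distributions, scaled by the squared temperature), which this page does not itself specify. The function names, logits, and temperature value are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumed formulation for illustration; real methods differ in details.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits for a 3-class problem.
teacher = [4.0, 1.0, 0.2]
student = [2.5, 1.5, 0.5]

loss = distillation_loss(student, teacher)
same = distillation_loss(teacher, teacher)  # identical outputs give zero loss
```

In practice this term is usually combined with an ordinary cross-entropy loss on the ground-truth labels. The weighting between the two is a tuning choice and is not covered here.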

Papers

Showing 2751–2775 of 4240 papers

Title	Date	Tasks	Status
A Survey on Model Compression for Large Language Models	Aug 15, 2023	Benchmarking, Knowledge Distillation	Unverified
A Survey on Recent Teacher-student Learning Studies	Apr 10, 2023	Knowledge Distillation, Survey	Unverified
A Survey on Symbolic Knowledge Distillation of Large Language Models	Jul 12, 2024	Knowledge Distillation, Survey	Unverified
A Survey on Transformer Compression	Feb 5, 2024	Knowledge Distillation, Mamba	Unverified
Asymmetric Decision-Making in Online Knowledge Distillation: Unifying Consensus and Divergence	Mar 9, 2025	Decision Making, Knowledge Distillation	Unverified
ADPS: Asymmetric Distillation Post-Segmentation for Image Anomaly Detection	Oct 19, 2022	Anomaly Detection, Anomaly Localization	Unverified
Asymmetric Image Retrieval with Cross Model Compatible Ensembles	Mar 30, 2023	Diversity, Face Recognition	Unverified
Asymmetric Temperature Scaling Makes Larger Networks Teach Well Again	Oct 10, 2022	Knowledge Distillation	Unverified
Asynchronous Convergence in Multi-Task Learning via Knowledge Distillation from Converged Tasks	Jul 1, 2022	Knowledge Distillation, Multi-Task Learning	Unverified
Edge Bias in Federated Learning and its Solution by Buffered Knowledge Distillation	Oct 20, 2020	Federated Learning, Knowledge Distillation	Unverified
A Technical Study into Small Reasoning Language Models	Jun 16, 2025	Code Generation, Computational Efficiency	Unverified
A Theoretical Analysis of Soft-Label vs Hard-Label Training in Neural Networks	Dec 12, 2024	Binary Classification, Knowledge Distillation	Unverified
A Transformer-in-Transformer Network Utilizing Knowledge Distillation for Image Recognition	Feb 24, 2025	Image Classification	Unverified
Attention-Guided Answer Distillation for Machine Reading Comprehension	Aug 23, 2018	Knowledge Distillation, Machine Reading Comprehension	Unverified
Attention-guided Feature Distillation for Semantic Segmentation	Mar 8, 2024	Knowledge Distillation, Segmentation	Unverified
Attention is all you need for boosting graph convolutional neural network	Mar 10, 2024	All, Knowledge Distillation	Unverified
AttentionLite: Towards Efficient Self-Attention Models for Vision	Dec 21, 2020	Knowledge Distillation	Unverified
MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models	Nov 9, 2019	Knowledge Distillation, Multi-Task Learning	Unverified
Audio-Oriented Multimodal Machine Comprehension: Task, Dataset and Model	Jul 4, 2021	Knowledge Distillation, Machine Reading Comprehension	Unverified
Audio Representation Learning by Distilling Video as Privileged Information	Feb 6, 2023	Emotion Recognition, Knowledge Distillation	Unverified
Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation	Oct 21, 2022	Data Augmentation, Diversity	Unverified
Augmenting Knowledge Distillation With Peer-To-Peer Mutual Learning For Model Compression	Oct 21, 2021	Knowledge Distillation, Model Compression	Unverified
A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation	Apr 2, 2023	Face Generation, Knowledge Distillation	Unverified
A Unified Framework for Continual Learning and Unlearning	Aug 21, 2024	Continual Learning, Knowledge Distillation	Unverified
A Unified Knowledge-Distillation and Semi-Supervised Learning Framework to Improve Industrial Ads Delivery Systems	Feb 5, 2025	Knowledge Distillation	Unverified

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T: BEiT-L S: ViT-B/14)	Top-1 accuracy (%)	86.43	—	Unverified
2	ScaleKD (T: Swin-L S: ViT-B/16)	Top-1 accuracy (%)	85.53	—	Unverified
3	ScaleKD (T: Swin-L S: ViT-S/16)	Top-1 accuracy (%)	83.93	—	Unverified
4	ScaleKD (T: Swin-L S: Swin-T)	Top-1 accuracy (%)	83.8	—	Unverified
5	KD++ (T: regnety-16GF S: ViT-B)	Top-1 accuracy (%)	83.6	—	Unverified
6	VkD (T: RegNety 160 S: DeiT-S)	Top-1 accuracy (%)	82.9	—	Unverified
7	SpectralKD (T: Swin-S S: Swin-T)	Top-1 accuracy (%)	82.7	—	Unverified
8	ScaleKD (T: Swin-L S: ResNet-50)	Top-1 accuracy (%)	82.55	—	Unverified
9	DiffKD (T: Swin-L S: Swin-T)	Top-1 accuracy (%)	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy (%)	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T: resnet-32x4, S: shufflenet-v2)	Top-1 accuracy (%)	79.86	—	Unverified
2	shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2)	Top-1 accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 accuracy (%)	78.08	—	Unverified
6	ReviewKD++ (T: resnet-32x4, S: shufflenet-v2)	Top-1 accuracy (%)	77.93	—	Unverified
7	ReviewKD++ (T: resnet-32x4, S: shufflenet-v1)	Top-1 accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified