SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, this capacity may not be fully utilized; a compact student model trained to mimic the larger teacher can therefore often retain much of the teacher's accuracy at a much lower inference cost.
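
As a rough illustration of the idea, the classic soft-target formulation trains the student to match the teacher's temperature-softened output distribution while still fitting the ground-truth labels. The sketch below assumes a PyTorch-style setup; the temperature `T`, the mixing weight `alpha`, and the `teacher`/`student` models are illustrative placeholders rather than settings taken from any paper listed on this page.

```python
# Minimal sketch of a soft-target distillation loss, assuming a PyTorch setup.
# T, alpha, and the teacher/student models are illustrative placeholders.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft-target term: KL divergence between temperature-softened distributions.
    # Scaling by T**2 keeps gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage inside a training loop: the teacher runs in eval mode without gradients.
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
```

Dividing the logits by a temperature above 1 exposes the teacher's relative confidences over the wrong classes, which is the extra signal the student learns from beyond the hard labels.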

Papers

Showing 1526-1550 of 4240 papers

Title | Status | Hype
GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples | Code | 0
A Dual-Contrastive Framework for Low-Resource Cross-Lingual Named Entity Recognition | Code | 0
Distilling the Undistillable: Learning from a Nasty Teacher | Code | 0
Group Multi-View Transformer for 3D Shape Analysis with Spatial Encoding | Code | 0
GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning | Code | 0
Distilling the Knowledge of Romanian BERTs Using Multiple Teachers | Code | 0
Distilling the Knowledge of Large-scale Generative Models into Retrieval Models for Efficient Open-domain Conversation | Code | 0
CDFKD-MFS: Collaborative Data-free Knowledge Distillation via Multi-level Feature Sharing | Code | 0
An Unsupervised Multiple-Task and Multiple-Teacher Model for Cross-lingual Named Entity Recognition | Code | 0
Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing | Code | 0
Graph Knowledge Distillation to Mixture of Experts | Code | 0
Handling Data Heterogeneity in Federated Learning via Knowledge Distillation and Fusion | Code | 0
Dynamic Data-Free Knowledge Distillation by Easy-to-Hard Learning Strategy | Code | 0
Distilling Stereo Networks for Performant and Efficient Leaner Networks | Code | 0
Graph-based Knowledge Distillation by Multi-head Attention Network | Code | 0
Cooperative Classification and Rationalization for Graph Generalization | Code | 0
Gradient Knowledge Distillation for Pre-trained Language Models | Code | 0
Graph Entropy Minimization for Semi-supervised Node Classification | Code | 0
Spending Your Winning Lottery Better After Drawing It | Code | 0
GOTHAM: Graph Class Incremental Learning Framework under Weak Supervision | Code | 0
Goldfish: An Efficient Federated Unlearning Framework | Code | 0
Answering Diverse Questions via Text Attached with Key Audio-Visual Clues | Code | 0
GNN's Uncertainty Quantification using Self-Distillation | Code | 0
Distilling Object Detectors With Global Knowledge | Code | 0
Goal-Conditioned Q-Learning as Knowledge Distillation | Code | 0
Page 62 of 170

Benchmark Results

In the tables below, T denotes the teacher model and S the student model; the Verified column is blank for entries whose status is Unverified.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified