Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity may not be fully utilized. In practice, the small (student) model is trained to reproduce the behavior of the large (teacher) model, typically by matching the teacher's softened output distribution in addition to the ground-truth labels.
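
As a concrete illustration, below is a minimal sketch of the classic soft-target distillation loss (Hinton et al., 2015), assuming PyTorch. The names `teacher`, `student`, `images`, and `labels`, and the hyperparameters `T` (temperature) and `alpha` (loss weight), are illustrative assumptions, not values taken from this page or from any listed paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target distillation loss in the style of Hinton et al. (2015).

    Combines a KL-divergence term between the temperature-softened teacher
    and student distributions with ordinary cross-entropy on the hard labels.
    T and alpha are illustrative hyperparameters, not values from this page.
    """
    # Soften both distributions with temperature T; the student side must be
    # log-probabilities for F.kl_div, the teacher side plain probabilities.
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    kd_term = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)

    # Standard supervised term on the ground-truth classes.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term

# Usage sketch: `teacher` and `student` are hypothetical classifiers, e.g. a
# large pretrained teacher and a compact student as in the tables below.
# teacher.eval()
# with torch.no_grad():
#     teacher_logits = teacher(images)
# student_logits = student(images)
# loss = distillation_loss(student_logits, teacher_logits, labels)
# loss.backward()
```

Scaling the KL term by T^2 keeps the gradient magnitudes of the soft and hard terms comparable as the temperature changes, which is why that factor appears in the standard formulation.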

Papers

Showing 1551–1575 of 4240 papers

| Title | Status | Hype |
|---|---|---|
| HDKD: Hybrid Data-Efficient Knowledge Distillation Network for Medical Image Classification | Code | 0 |
| Distilling Stereo Networks for Performant and Efficient Leaner Networks | Code | 0 |
| Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing | Code | 0 |
| Graph Entropy Minimization for Semi-supervised Node Classification | Code | 0 |
| Graph Knowledge Distillation to Mixture of Experts | Code | 0 |
| Answering Diverse Questions via Text Attached with Key Audio-Visual Clues | Code | 0 |
| Gradient Knowledge Distillation for Pre-trained Language Models | Code | 0 |
| Distilling Object Detectors With Global Knowledge | Code | 0 |
| Distilling Object Detectors with Fine-grained Feature Imitation | Code | 0 |
| Proxy-Anchor and EVT-Driven Continual Learning Method for Generalized Category Discovery | Code | 0 |
| Catch-Up Distillation: You Only Need to Train Once for Accelerating Sampling | Code | 0 |
| Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation | Code | 0 |
| Autoregressive Knowledge Distillation through Imitation Learning | Code | 0 |
| Feature Fusion for Online Mutual Knowledge Distillation | Code | 0 |
| GOTHAM: Graph Class Incremental Learning Framework under Weak Supervision | Code | 0 |
| Graph-based Knowledge Distillation by Multi-head Attention Network | Code | 0 |
| Group Multi-View Transformer for 3D Shape Analysis with Spatial Encoding | Code | 0 |
| GNN's Uncertainty Quantification using Self-Distillation | Code | 0 |
| Feature Normalized Knowledge Distillation for Image Classification | Code | 0 |
| Feature Representation Learning for Robust Retinal Disease Detection from Optical Coherence Tomography Images | Code | 0 |
| Distilling Reasoning Capabilities into Smaller Language Models | Code | 0 |
| Goal-Conditioned Q-Learning as Knowledge Distillation | Code | 0 |
| Distilling Model Knowledge | Code | 0 |
| Class incremental learning with probability dampening and cascaded gated classifier | Code | 0 |
| GLANCE: Global to Local Architecture-Neutral Concept-based Explanations | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified |
| 5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified |
| 6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | | Unverified |