Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1651–1700 of 4240 papers

Title	Date	Tasks	Status	Score
Distilling Image Dehazing With Heterogeneous Task Imitation	Jun 1, 2020	image-classificationImage Classification	CodeCode Available	5
GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples	May 13, 2023	BinarizationKnowledge Distillation	CodeCode Available	5
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation	Jun 12, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	5
SFT-KD-Recon: Learning a Student-friendly Teacher for Knowledge Distillation in Magnetic Resonance Image Reconstruction	Apr 11, 2023	Image ReconstructionKnowledge Distillation	CodeCode Available	5
Greedy-layer Pruning: Speeding up Transformer Models for Natural Language Processing	May 31, 2021	Knowledge DistillationUnsupervised Pre-training	CodeCode Available	5
Distilling Global and Local Logits With Densely Connected Relations	Jan 1, 2021	image-classificationImage Classification	CodeCode Available	5
Cross-View Consistency Regularisation for Knowledge Distillation	Dec 21, 2024	Knowledge Distillation	CodeCode Available	5
FM2DS: Few-Shot Multimodal Multihop Data Synthesis with Knowledge Distillation for Question Answering	Dec 9, 2024	Knowledge DistillationQuestion Answering	CodeCode Available	5
Graph Knowledge Distillation to Mixture of Experts	Jun 17, 2024	Knowledge DistillationMixture-of-Experts	CodeCode Available	5
Distilling Focal Knowledge From Imperfect Expert for 3D Object Detection	Jan 1, 2023	3D geometry3D Object Detection	CodeCode Available	5
Camera-Incremental Object Re-Identification with Identity Knowledge Evolution	May 25, 2023	Knowledge DistillationObject	CodeCode Available	5
Graph Entropy Minimization for Semi-supervised Node Classification	May 31, 2023	ClassificationKnowledge Distillation	CodeCode Available	5
Graph-based Knowledge Distillation by Multi-head Attention Network	Jul 4, 2019	Inductive BiasKnowledge Distillation	CodeCode Available	5
Gradient Knowledge Distillation for Pre-trained Language Models	Nov 2, 2022	Knowledge Distillation	CodeCode Available	5
Group Multi-View Transformer for 3D Shape Analysis with Spatial Encoding	Dec 27, 2023	3D Classification3D Shape Recognition	CodeCode Available	5
Annealing Knowledge Distillation	Apr 14, 2021	image-classificationImage Classification	CodeCode Available	5
Foundation Models for Structural Health Monitoring	Apr 3, 2024	Anomaly DetectionKnowledge Distillation	CodeCode Available	5
Distilling and Transferring Knowledge via cGAN-generated Samples for Image Classification and Regression	Apr 7, 2021	General Classificationimage-classification	CodeCode Available	5
Spending Your Winning Lottery Better After Drawing It	Jan 8, 2021	Knowledge Distillation	CodeCode Available	5
Goal-Conditioned Q-Learning as Knowledge Distillation	Aug 28, 2022	Knowledge DistillationQ-Learning	CodeCode Available	5
A Comprehensive Overhaul of Feature Distillation	Apr 3, 2019	General Classificationimage-classification	CodeCode Available	5
GNN's Uncertainty Quantification using Self-Distillation	Jun 24, 2025	Knowledge DistillationUncertainty Quantification	CodeCode Available	5
Goldfish: An Efficient Federated Unlearning Framework	Apr 4, 2024	Knowledge DistillationMachine Unlearning	CodeCode Available	5
GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation	May 13, 2024	image-classificationImage Classification	CodeCode Available	5
Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices	Jul 12, 2022	Emotion RecognitionKeyword Spotting	CodeCode Available	5
Distilled Neural Networks for Efficient Learning to Rank	Feb 22, 2022	CPUInformation Retrieval	CodeCode Available	5
CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with Clustered Aggregation and Knowledge DIStilled Regularization	Feb 21, 2023	Federated LearningKnowledge Distillation	CodeCode Available	5
Distilled Gradual Pruning with Pruned Fine-tuning	Feb 15, 2024	Image ClassificationKnowledge Distillation	CodeCode Available	5
GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment	May 30, 2024	GSM8KKnowledge Distillation	CodeCode Available	5
Distilled GPT for Source Code Summarization	Aug 28, 2023	Code SummarizationGPU	CodeCode Available	5
cViL: Cross-Lingual Training of Vision-Language Models using Knowledge Distillation	Jun 7, 2022	Knowledge DistillationQuestion Answering	CodeCode Available	5
Preference-Consistent Knowledge Distillation for Recommender System	Nov 8, 2023	Knowledge DistillationRecommendation Systems	CodeCode Available	5
A Diffusion Model and Knowledge Distillation Framework for Robust Coral Detection in Complex Underwater Environments	Jan 6, 2025	2D Object DetectionKnowledge Distillation	CodeCode Available	5
GKD: Semi-supervised Graph Knowledge Distillation for Graph-Independent Inference	Apr 8, 2021	Disease Predictiongraph construction	CodeCode Available	5
GLANCE: Global to Local Architecture-Neutral Concept-based Explanations	Jul 5, 2022	DisentanglementFeature Importance	CodeCode Available	5
GOTHAM: Graph Class Incremental Learning Framework under Weak Supervision	Apr 7, 2025	Attributeclass-incremental learning	CodeCode Available	5
Distill-DBDGAN: Knowledge Distillation and Adversarial Learning Framework for Defocus Blur Detection	Feb 1, 2023	Defocus Blur DetectionGenerative Adversarial Network	CodeCode Available	5
DistillCSE: Distilled Contrastive Learning for Sentence Embeddings	Oct 20, 2023	Contrastive LearningKnowledge Distillation	CodeCode Available	5
Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning	Aug 2, 2024	Continual LearningKnowledge Distillation	CodeCode Available	5
Generative Denoise Distillation: Simple Stochastic Noises Induce Efficient Knowledge Transfer for Dense Prediction	Jan 16, 2024	Instance SegmentationKnowledge Distillation	CodeCode Available	5
DAdEE: Unsupervised Domain Adaptation in Early Exit PLMs	Oct 6, 2024	Domain AdaptationKnowledge Distillation	CodeCode Available	5
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers	Aug 12, 2022	image-classificationImage Classification	CodeCode Available	5
FS-BAN: Born-Again Networks for Domain Generalization Few-Shot Classification	Aug 23, 2022	Domain GeneralizationKnowledge Distillation	CodeCode Available	5
DAD++: Improved Data-free Test Time Adversarial Defense	Sep 10, 2023	Adversarial DefenseAdversarial Robustness	CodeCode Available	5
Distillation Techniques for Pseudo-rehearsal Based Incremental Learning	Jul 8, 2018	Incremental LearningKnowledge Distillation	CodeCode Available	5
Generalizing Teacher Networks for Effective Knowledge Distillation Across Student Architectures	Jul 22, 2024	Knowledge DistillationModel Compression	CodeCode Available	5
Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation	Mar 26, 2023	Knowledge Distillation	CodeCode Available	5
Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor	Oct 10, 2020	Dependency ParsingKnowledge Distillation	CodeCode Available	5
Generalized Knowledge Distillation via Relationship Matching	May 4, 2022	Few-Shot LearningIncremental Learning	CodeCode Available	5
Generate, Annotate, and Learn: NLP with Synthetic Text	Jun 11, 2021	Few-Shot LearningImage Classification	CodeCode Available	5

Show:10 25 50

← PrevPage 34 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified