
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models, such as very deep neural networks or ensembles of many models, have higher knowledge capacity than small models, this capacity may not be fully utilized, so a well-trained student can often recover much of the teacher's performance at a fraction of the inference cost.
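The standard recipe, introduced by Hinton et al. (2015), trains the student to match the teacher's temperature-softened output distribution alongside the usual hard-label loss. Below is a minimal PyTorch sketch of that loss; the temperature and alpha values are illustrative defaults, not taken from any method listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    """Classic soft-label KD loss: blend a softened teacher-matching
    term with the usual hard-label cross-entropy."""
    # Soften both distributions with the temperature, then match them
    # with KL divergence; the temperature**2 factor restores the
    # gradient magnitude that the softening scales down.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_student, soft_targets,
                  reduction="batchmean") * temperature ** 2
    # Standard cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Usage sketch: the teacher is frozen, only the student is updated.
# teacher.eval()
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = distillation_loss(student(x), teacher_logits, y)
```

In practice the teacher runs in eval mode with gradients disabled, and many of the methods in the leaderboards below replace or augment this logit-matching term with feature-level objectives.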

Papers

Showing 3901–3925 of 4240 papers (page 157 of 170)

- Label Semantic Knowledge Distillation for Unbiased Scene Graph Generation
- LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression
- LaDiMo: Layer-wise Distillation Inspired MoEfier
- LAKD-Activation Mapping Distillation Based on Local Learning
- LAMeTA: Intent-Aware Agentic Network Optimization via a Large AI Model-Empowered Two-Stage Approach
- Language Graph Distillation for Low-Resource Machine Translation
- Language Modelling via Learning to Rank
- Language-Oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image Generation
- LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models
- Just CHOP: Embarrassingly Simple LLM Compression
- Large Language Model Guided Knowledge Distillation for Time Series Anomaly Detection
- Large Language Model Meets Graph Neural Network in Knowledge Distillation
- Large Model for Small Data: Foundation Model for Cross-Modal RF Human Activity Recognition
- Large-Scale Generative Data-Free Distillation
- LaSNN: Layer-wise ANN-to-SNN Distillation for Effective and Efficient Training in Deep Spiking Neural Networks
- Layer Attack Unlearning: Fast and Accurate Machine Unlearning via Layer Level Attack and Knowledge Distillation
- LayerCollapse: Adaptive compression of neural networks
- Layer Importance for Mathematical Reasoning is Forged in Pre-Training and Invariant after Post-Training
- Layerwise Bregman Representation Learning with Applications to Knowledge Distillation
- Noisy Data Meets Privacy: Training Local Models with Post-Processed Remote Queries
- LEAD: Liberal Feature-based Distillation for Dense Retrieval
- LEALLA: Learning Lightweight Language-agnostic Sentence Embeddings with Knowledge Distillation
- Learnable Cross-modal Knowledge Distillation for Multi-modal Learning with Missing Modality
- Learn from Balance: Rectifying Knowledge Transfer for Long-Tailed Scenarios
- Learn From the Past: Experience Ensemble Knowledge Distillation

Benchmark Results

In the tables below, "T:" denotes the teacher model and "S:" the student; the Verified column is empty ("-") where no independent verification has been recorded.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNety-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified