
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity may not be fully utilized. In distillation, a compact "student" network is trained to reproduce the behavior of a larger "teacher", typically by matching the teacher's softened output distribution in addition to the ground-truth labels.
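A minimal sketch of that classic soft-target objective (Hinton et al., 2015) is shown below in PyTorch. The temperature T and weight alpha are illustrative defaults, not values taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD loss: a weighted sum of (a) KL divergence between
    temperature-softened teacher and student distributions and (b) ordinary
    cross-entropy against the hard labels. T and alpha are illustrative."""
    # Soften both distributions with temperature T; the T**2 factor keeps
    # the soft-target gradient magnitude comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    # Standard supervised loss on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Quick self-contained check with dummy logits (batch of 8, 100 classes);
# in practice the teacher's logits are computed with torch.no_grad().
student_logits = torch.randn(8, 100)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
```

Many of the methods in the tables below (e.g., DIST, DKD-style decoupled losses) replace or reweight the KL term, but keep this teacher-student template.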

Papers

Showing 326–350 of 4240 papers

Title | Status | Hype
Action knowledge for video captioning with graph neural networks | Code | 1
Complementary Relation Contrastive Distillation | Code | 1
Prototype-based Incremental Few-Shot Semantic Segmentation | Code | 1
Conformer and Blind Noisy Students for Improved Image Quality Assessment | Code | 1
CTC-based Non-autoregressive Textless Speech-to-Speech Translation | Code | 1
Curriculum Temperature for Knowledge Distillation | Code | 1
Dark Experience for General Continual Learning: a Strong, Simple Baseline | Code | 1
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation | Code | 1
DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners | Code | 1
A framework for benchmarking class-out-of-distribution detection and its application to ImageNet | Code | 1
A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering | Code | 1
Continual Learning for LiDAR Semantic Segmentation: Class-Incremental and Coarse-to-Fine strategies on Sparse Data | Code | 1
Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint | Code | 1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1
Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation | Code | 1
Decoupled Kullback-Leibler Divergence Loss | Code | 1
Decoupled Multimodal Distilling for Emotion Recognition | Code | 1
AgeFlow: Conditional Age Progression and Regression with Normalizing Flows | Code | 1
DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer | Code | 1
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence | Code | 1
Deep Structured Instance Graph for Distilling Object Detectors | Code | 1
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone | Code | 1
Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation | Code | 1
Dense Interspecies Face Embedding | Code | 1
Coaching a Teachable Student | Code | 1
Page 14 of 170

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | | Unverified