Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, that capacity may not be fully utilized; a well-trained student can often recover much of the teacher's accuracy at a fraction of the inference cost.
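
As a concrete reference point, the classic formulation (Hinton et al., 2015) trains the student to match the teacher's temperature-softened output distribution alongside the usual hard-label loss. Below is a minimal sketch in PyTorch; the function name and the temperature/alpha values are illustrative defaults, not taken from any paper listed here.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    # Illustrative sketch of the classic KD objective; hyperparameter
    # values here are placeholders, not from any specific paper.
    # Soften both output distributions with the temperature and match
    # them with KL divergence; the T^2 factor keeps the gradient scale
    # comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Ordinary cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha trades teacher imitation against direct supervision.
    return alpha * soft + (1 - alpha) * hard
```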

Papers

Showing 401-425 of 4240 papers (page 17 of 170)

| Title | Status | Hype |
|---|---|---|
| Backdoor Attacks on Self-Supervised Learning | Code | 1 |
| Backdoor Cleansing with Unlabeled Data | Code | 1 |
| DARTS: Double Attention Reference-based Transformer for Super-resolution | Code | 1 |
| Faster ILOD: Incremental Learning for Object Detectors based on Faster RCNN | Code | 1 |
| Dark Experience for General Continual Learning: a Strong, Simple Baseline | Code | 1 |
| Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells | Code | 1 |
| Distilling the Knowledge of BERT for Sequence-to-Sequence ASR | Code | 1 |
| Balanced Knowledge Distillation for Long-tailed Learning | Code | 1 |
| Data Diversification: A Simple Strategy For Neural Machine Translation | Code | 1 |
| DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners | Code | 1 |
| f-Divergence Minimization for Sequence-Level Knowledge Distillation | Code | 1 |
| Feature Structure Distillation with Centered Kernel Alignment in BERT Transferring | Code | 1 |
| Aligned Structured Sparsity Learning for Efficient Image Super-Resolution | Code | 1 |
| Data-Free Class-Incremental Hand Gesture Recognition | Code | 1 |
| CaMEL: Mean Teacher Learning for Image Captioning | Code | 1 |
| Data-Free Network Quantization With Adversarial Knowledge Distillation | Code | 1 |
| Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint | Code | 1 |
| BearingPGA-Net: A Lightweight and Deployable Bearing Fault Diagnosis Network via Decoupled Knowledge Distillation and FPGA Acceleration | Code | 1 |
| DA-Mamba: Domain Adaptive Hybrid Mamba-Transformer Based One-Stage Object Detection | Code | 1 |
| FedMD: Heterogenous Federated Learning via Model Distillation | Code | 1 |
| FedSOL: Stabilized Orthogonal Learning with Proximal Restrictions in Federated Learning | Code | 1 |
| FedUKD: Federated UNet Model with Knowledge Distillation for Land Use Classification from Satellite and Street Views | Code | 1 |
| Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model | Code | 1 |
| Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning | Code | 1 |
| 3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving | Code | 1 |

Benchmark Results

In the model names below, T denotes the teacher and S the student in each distillation pair. No entry has a verified value yet, so the Verified column is empty.

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified |
| 5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified |
| 6 | VkD (T: RegNety-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | | Unverified |