
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. A smaller student trained to mimic the teacher's outputs can therefore often approach the teacher's accuracy at a fraction of the inference cost.
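
The standard recipe, due to Hinton et al. (2015), trains the student on a blend of the temperature-softened teacher distribution and the ground-truth labels. Below is a minimal sketch assuming PyTorch; `teacher`, `student`, `temperature=4.0`, and `alpha=0.9` are illustrative placeholders rather than settings from any paper listed here.

```python
# Minimal soft-target distillation sketch (after Hinton et al., 2015).
# Assumes PyTorch; `teacher` and `student` can be any models whose
# outputs are logits over the same set of classes.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    # Soften both distributions with the same temperature; the T^2 factor
    # keeps gradient magnitudes comparable to plain cross-entropy.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd_term = F.kl_div(log_student, soft_targets,
                       reduction="batchmean") * temperature ** 2
    # Ordinary supervised term on the hard labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Typical training step: the teacher is frozen, only the student updates.
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
# loss.backward()
```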

Papers

Showing 626–650 of 4240 papers

Title | Status | Hype
It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher | Code | 1
Self-Distillation from the Last Mini-Batch for Consistency Regularization | Code | 1
Rainbow Keywords: Efficient Incremental Learning for Online Spoken Keyword Spotting | Code | 1
Monitored Distillation for Positive Congruent Depth Completion | Code | 1
Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection | Code | 1
Uncertainty-aware Contrastive Distillation for Incremental Semantic Segmentation | Code | 1
Knowledge Distillation with the Reused Teacher Classifier | Code | 1
PCA-Based Knowledge Distillation Towards Lightweight and Content-Style Balanced Photorealistic Style Transfer Models | Code | 1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1
Rich Feature Construction for the Optimization-Generalization Dilemma | Code | 1
Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction | Code | 1
R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning | Code | 1
SSD-KD: A Self-supervised Diverse Knowledge Distillation Method for Lightweight Skin Lesion Classification Using Dermoscopic Images | Code | 1
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization | Code | 1
Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation | Code | 1
Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation | Code | 1
Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning | Code | 1
When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation | Code | 1
Graph Flow: Cross-layer Graph Flow Distillation for Dual Efficient Medical Image Segmentation | Code | 1
SATS: Self-Attention Transfer for Continual Semantic Segmentation | Code | 1
Unified Visual Transformer Compression | Code | 1
Representation Compensation Networks for Continual Semantic Segmentation | Code | 1
Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability | Code | 1
Prediction-Guided Distillation for Dense Object Detection | Code | 1
Overcoming Catastrophic Forgetting beyond Continual Learning: Balanced Training for Neural Machine Translation | Code | 1
Page 26 of 170

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T: resnet32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified