Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher, abbreviated T below) to a smaller one (the student, abbreviated S). Large models, such as very deep neural networks or ensembles of many models, have a higher knowledge capacity than small models, but this capacity might not be fully utilized.
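The description above does not say how the transfer is done. One common formulation, assumed here purely for illustration, trains the small model (the student) to match the temperature-softened output distribution of the large model (the teacher) while still fitting the ground-truth label. The function names, the `temperature` and `alpha` parameters, and their default values below are hypothetical choices for this sketch and are not taken from any paper listed on this page.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term.

    temperature and alpha are illustrative hyperparameters (assumptions).
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student) if pt > 0)

    # Ordinary cross-entropy against the true class, at temperature 1.
    ce = -math.log(softmax(student_logits)[label])

    # Scaling by temperature**2 keeps the soft term on a comparable scale
    # when the temperature changes.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

With `alpha=1.0` the loss depends only on how closely the student matches the teacher, so it is zero when both produce the same logits; with `alpha=0.0` it reduces to ordinary supervised training of the student.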

Papers

Showing 3051–3075 of 4240 papers

Title	Date	Tasks	Status	Hype
One General Teacher for Multi-Data Multi-Task: A New Knowledge Distillation Framework for Discourse Relation Analysis	Nov 16, 2021	Knowledge Distillation, Multi-Task Learning	Unverified	0
Self-Distilled Pruning of Neural Networks	Nov 16, 2021	Knowledge Distillation, Language Modeling	Unverified	0
Making Small Language Models Better Few-Shot Learners	Nov 16, 2021	Few-Shot Learning, Knowledge Distillation	Unverified	0
Feature Structure Distillation for BERT Transferring	Nov 16, 2021	Knowledge Distillation	Unverified	0
Multi-stage Distillation Framework for Cross-Lingual Semantic Similarity Matching	Nov 16, 2021	Contrastive Learning, Knowledge Distillation	Unverified	0
Learning to Teach with Student Feedback	Nov 16, 2021	Knowledge Distillation	Unverified	0
Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm	Nov 16, 2021	Knowledge Distillation	Unverified	0
Aligned Weight Regularizers for Pruning Pretrained Neural Networks	Nov 16, 2021	Knowledge Distillation, Language Modeling	Unverified	0
NVIDIA NeMo Neural Machine Translation Systems for English-German and English-Russian News and Biomedical Tasks at WMT21	Nov 16, 2021	Data Augmentation, Knowledge Distillation	Unverified	0
Synthetic Unknown Class Learning for Learning Unknowns	Nov 15, 2021	Diversity, Knowledge Distillation	Unverified	0
Robust and Accurate Object Detection via Self-Knowledge Distillation	Nov 14, 2021	Adversarial Robustness, Knowledge Distillation	Code Available	0
Facial Landmark Points Detection Using Knowledge Distillation-Based Neural Networks	Nov 13, 2021	Face Alignment, Facial Landmark Detection	Code Available	0
Learning Interpretation with Explainable Knowledge Distillation	Nov 12, 2021	Knowledge Distillation, Model Compression	Unverified	0
Domain Generalization on Efficient Acoustic Scene Classification using Residual Normalization	Nov 12, 2021	Acoustic Scene Classification, Classification	Unverified	0
Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition	Nov 9, 2021	Continual Learning, Knowledge Distillation	Code Available	0
On Representation Knowledge Distillation for Graph Neural Networks	Nov 9, 2021	Contrastive Learning, Knowledge Distillation	Code Available	1
A Survey on Green Deep Learning	Nov 8, 2021	Deep Learning, Knowledge Distillation	Unverified	0
Class Token and Knowledge Distillation for Multi-head Self-Attention Speaker Verification Systems	Nov 6, 2021	Knowledge Distillation, Philosophy	Unverified	0
Oracle Teacher: Leveraging Target Information for Better Knowledge Distillation of CTC Models	Nov 5, 2021	Knowledge Distillation, Machine Translation	Unverified	0
Visualizing the Emergence of Intermediate Visual Patterns in DNNs	Nov 5, 2021	Knowledge Distillation	Unverified	0
DVFL: A Vertical Federated Learning Method for Dynamic Data	Nov 5, 2021	Federated Learning, Knowledge Distillation	Unverified	0
AUTOKD: Automatic Knowledge Distillation Into A Student Architecture Family	Nov 5, 2021	Bayesian Optimization, Knowledge Distillation	Unverified	0
A methodology for training homomorphic encryption friendly neural networks	Nov 5, 2021	Knowledge Distillation, Privacy Preserving	Unverified	0
Leveraging Advantages of Interactive and Non-Interactive Models for Vector-Based Cross-Lingual Information Retrieval	Nov 3, 2021	Computational Efficiency, Cross-Lingual Information Retrieval	Unverified	0
LTD: Low Temperature Distillation for Robust Adversarial Training	Nov 3, 2021	Knowledge Distillation	Unverified	0

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

Dataset: ImageNet
#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

Dataset: CIFAR-100
#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

Dataset: COCO (Common Objects in Context)
#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

Dataset: COCO 2017 val
#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

Dataset: PASCAL VOC
#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

Dataset: KITTI
#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified