Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
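
One common way to carry out this transfer, which the description above does not itself specify, is to train the small (student) model to match the temperature-softened output distribution of the large (teacher) model while still fitting the ground-truth label. The sketch below assumes that soft-target formulation. The names `softmax` and `distillation_loss`, the temperature `T`, and the mixing weight `alpha` are illustrative choices, not taken from this page.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / T for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.5):
    """Soft-target distillation loss for one example (assumed formulation).

    alpha weights the teacher-matching term; (1 - alpha) weights the
    ordinary cross-entropy against the ground-truth label.
    """
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL(teacher || student) on the softened distributions.
    # Assumes moderate logits so no probability underflows to zero.
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Cross-entropy of the student at temperature 1 against the true label.
    ce = -math.log(softmax(student_logits)[label])
    # T**2 keeps the soft-term gradient scale comparable across temperatures.
    return alpha * T * T * kl + (1.0 - alpha) * ce

# Hypothetical logits for a 3-class problem.
teacher = [6.0, 2.0, -1.0]
student = [3.0, 1.5, 0.0]
loss = distillation_loss(student, teacher, label=0)
```

A higher `T` flattens the teacher distribution so that the relative probabilities of the non-target classes contribute more to the loss; `alpha` trades that signal off against the hard label.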

Papers

Showing 3351–3375 of 4240 papers

Title	Date	Tasks	Status	Hype
LENAS: Learning-based Neural Architecture Search and Ensemble for 3D Radiotherapy Dose Prediction	Jun 12, 2021	Diversity, Ensemble Learning	Code Available	0
RefBERT: Compressing BERT by Referencing to Pre-computed Representations	Jun 11, 2021	Knowledge Distillation	Unverified	0
Generate, Annotate, and Learn: NLP with Synthetic Text	Jun 11, 2021	Few-Shot Learning, Image Classification	Code Available	0
Does Knowledge Distillation Really Work?	Jun 10, 2021	Knowledge Distillation	Code Available	1
Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation	Jun 10, 2021	Knowledge Distillation	Code Available	0
AKE-GNN: Effective Graph Learning with Adaptive Knowledge Exchange	Jun 10, 2021	Classification, Graph Classification	Unverified	0
Knowledge distillation: A good teacher is patient and consistent	Jun 9, 2021	Image Classification, Knowledge Distillation	Code Available	2
Distilling Image Classifiers in Object Detectors	Jun 9, 2021	Knowledge Distillation, Object	Code Available	1
XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation	Jun 8, 2021	Knowledge Distillation, NER	Code Available	1
Learning by Distillation: A Self-Supervised Learning Framework for Optical Flow Estimation	Jun 8, 2021	Knowledge Distillation, Optical Flow Estimation	Unverified	0
BERT Learns to Teach: Knowledge Distillation with Meta Learning	Jun 8, 2021	Knowledge Distillation, Meta-Learning	Code Available	1
RoSearch: Search for Robust Student Architectures When Distilling Pre-trained Language Models	Jun 7, 2021	Adversarial Robustness, Knowledge Distillation	Unverified	0
Zero-Shot Knowledge Distillation from a Decision-Based Black-Box Model	Jun 7, 2021	Knowledge Distillation	Code Available	1
Preservation of the Global Knowledge by Not-True Distillation in Federated Learning	Jun 6, 2021	Continual Learning, Federated Learning	Code Available	1
Bidirectional Distillation for Top-K Recommender System	Jun 5, 2021	Knowledge Distillation, Model Compression	Code Available	1
MergeDistill: Merging Pre-trained Language Models using Distillation	Jun 5, 2021	Cross-Lingual Transfer, Knowledge Distillation	Unverified	0
ERNIE-Tiny : A Progressive Distillation Framework for Pretrained Transformer Compression	Jun 4, 2021	Knowledge Distillation	Code Available	0
Not All Knowledge Is Created Equal: Mutual Distillation of Confident Knowledge	Jun 2, 2021	All, Knowledge Distillation	Unverified	0
Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation	Jun 2, 2021	Knowledge Distillation, Translation	Code Available	0
One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers	Jun 2, 2021	Knowledge Distillation, Language Modeling	Unverified	0
Modality-specific Distillation	Jun 1, 2021	Knowledge Distillation, Meta-Learning	Unverified	0
Cost-effective Deployment of BERT Models in Serverless Environment	Jun 1, 2021	Knowledge Distillation, Semantic Textual Similarity	Unverified	0
Continual Learning for Neural Machine Translation	Jun 1, 2021	Continual Learning, Knowledge Distillation	Unverified	0
Multi-Grained Knowledge Distillation for Named Entity Recognition	Jun 1, 2021	Knowledge Distillation, named-entity-recognition	Unverified	0
Towards Quantifiable Dialogue Coherence Evaluation	Jun 1, 2021	Coherence Evaluation, Dialogue Evaluation	Code Available	1

Benchmark Results

Datasets (one table each, in the order listed): ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified