
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have a higher knowledge capacity than small models, this capacity may not be fully utilized. Distillation exploits that gap by training a small "student" model to reproduce the behavior of a large "teacher" model, typically by matching the teacher's output distribution rather than only the hard labels.
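
The classic recipe (Hinton et al., 2015) trains the student on a weighted sum of the usual hard-label loss and a temperature-softened KL divergence to the teacher's logits. Below is a minimal PyTorch sketch of that loss; the temperature T and weight alpha are illustrative defaults, not values taken from any paper listed on this page:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD loss: soft teacher targets blended with hard labels.

    T and alpha are illustrative hyperparameters, not tied to any
    specific paper on this page.
    """
    # KL divergence between temperature-softened class distributions.
    # Scaling by T**2 keeps the soft-loss gradient magnitude roughly
    # constant as the temperature changes (Hinton et al., 2015).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.log_softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
        log_target=True,
    ) * (T * T)
    # Ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Typical training step: the teacher is frozen, only the student learns.
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
```

Methods on the leaderboards below often augment or replace this logit-matching term with feature- or attention-level matching, but the teacher (T) / student (S) setup is the same.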

Papers

Showing 3076–3100 of 4240 papers

Title | Status | Hype
Knowledge Cross-Distillation for Membership Privacy | - | 0
The LMU Munich System for the WMT 2021 Large-Scale Multilingual Machine Translation Shared Task | - | 0
The NiuTrans System for the WMT 2021 Efficiency Task | - | 0
NVIDIA NeMo’s Neural Machine Translation Systems for English-German and English-Russian News and Biomedical Tasks at WMT21 | - | 0
Papago’s Submission for the WMT21 Quality Estimation Shared Task | - | 0
The Mininglamp Machine Translation System for WMT21 | - | 0
HW-TSC’s Participation in the WMT 2021 News Translation Shared Task | - | 0
HW-TSC’s Participation in the WMT 2021 Large-Scale Multilingual Translation Task | - | 0
TenTrans Large-Scale Multilingual Machine Translation System for WMT21 | - | 0
Efficient Machine Translation with Model Pruning and Quantization | - | 0
AUTOSUMM: Automatic Model Creation for Text Summarization | - | 0
Students Who Study Together Learn Better: On the Importance of Collective Knowledge Distillation for Domain Transfer in Fact Verification | - | 0
Universal-KD: Attention-based Output-Grounded Intermediate Layer Knowledge Distillation | - | 0
Exploring Non-Autoregressive Text Style Transfer | Code | 0
Collaborative Learning of Bidirectional Decoders for Unsupervised Text Style Transfer | Code | 0
deepQuest-py: Large and Distilled Models for Quality Estimation | Code | 0
PDALN: Progressive Domain Adaptation over a Pre-trained Model for Low-Resource Cross-Domain Named Entity Recognition | - | 0
Domain-Lifelong Learning for Dialogue State Tracking via Knowledge Preservation Networks | Code | 0
GAML-BERT: Improving BERT Early Exiting by Gradient Aligned Mutual Learning | - | 0
Improving Stance Detection with Multi-Dataset Learning and Knowledge Distillation | Code | 0
Mutual-Learning Improves End-to-End Speech Translation | - | 0
RW-KD: Sample-wise Loss Terms Re-Weighting for Knowledge Distillation | - | 0
Combining Curriculum Learning and Knowledge Distillation for Dialogue Generation | - | 0
Distilling Knowledge for Empathy Detection | Code | 0
Multilingual Neural Machine Translation: Can Linguistic Hierarchies Help? | - | 0
Page 124 of 170

Benchmark Results

In each table, T: and S: identify the teacher and student models. "Claimed" is the score reported by the method's authors; "Verified" remains empty, and the status "Unverified", until the result has been independently reproduced.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified
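
For reference, the two scalar metrics that appear most often above can be computed as follows. This is a minimal sketch; mAP, which requires per-class precision-recall integration, is omitted:

```python
import torch

def top1_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Percentage of samples whose highest-scoring class equals the label."""
    return (logits.argmax(dim=-1) == labels).float().mean().item() * 100.0

def rmse(pred: torch.Tensor, target: torch.Tensor) -> float:
    """Root-mean-square error, as reported for the depth-estimation entry."""
    return torch.sqrt(torch.mean((pred - target) ** 2)).item()
```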