
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized, so a much smaller student model can often be trained to approximate the larger model's behavior at a fraction of the computational cost.
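
In practice, the transfer is commonly done by training the small "student" on the softened output distribution of the large "teacher" alongside the ground-truth labels. Below is a minimal PyTorch sketch of that classic soft-target loss (Hinton et al., 2015); the function name, temperature T, and weighting alpha are illustrative choices, not taken from any specific paper listed on this page.

```python
# Minimal sketch of soft-target knowledge distillation (Hinton et al., 2015).
# T (temperature) and alpha (loss weighting) are illustrative hyperparameters.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend a softened KL term against the teacher with the usual
    cross-entropy against the ground-truth labels."""
    # Teacher and student distributions softened by temperature T.
    soft_teacher = F.log_softmax(teacher_logits / T, dim=-1)
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    # KL(teacher || student), scaled by T^2 so gradients stay comparable in size.
    kd = F.kl_div(soft_student, soft_teacher,
                  reduction="batchmean", log_target=True) * (T * T)
    # Standard hard-label loss for the student.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Toy usage: distil a frozen teacher's predictions into a smaller student.
teacher_logits = torch.randn(8, 100)                       # outputs of the large model
student_logits = torch.randn(8, 100, requires_grad=True)   # outputs of the small model
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```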

Papers

Showing 4001-4025 of 4240 papers

Title | Status | Hype
Understanding Knowledge Distillation in Non-autoregressive Machine Translation |  | 0
Data Diversification: A Simple Strategy For Neural Machine Translation | Code | 1
ESPnet How2 Speech Translation System for IWSLT 2019: Pre-training, Knowledge Distillation, and Going Deeper |  | 0
Weakly Supervised Cross-lingual Semantic Relation Classification via Knowledge Distillation |  | 0
Natural Language Generation for Effective Knowledge Distillation | Code | 0
Distilling Pixel-Wise Feature Similarities for Semantic Segmentation |  | 0
A Simple but Effective BERT Model for Dialog State Tracking on Resource-Limited Systems |  | 0
MOD: A Deep Mixture Model with Online Knowledge Distillation for Large Scale Video Temporal Concept Localization | Code | 0
Variational Student: Learning Compact and Sparser Networks in Knowledge Distillation Framework |  | 0
Secost: Sequential co-supervision for large scale weakly labeled audio event detection |  | 0
An Empirical Study of Efficient ASR Rescoring with Transformers |  | 0
Adversarial Feature Alignment: Avoid Catastrophic Forgetting in Incremental Task Lifelong Learning |  | 0
Contrastive Representation Distillation | Code | 1
Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System |  | 0
A Generalized and Robust Method Towards Practical Gaze Estimation on Smart Phone |  | 0
Noise as a Resource for Learning in Knowledge Distillation |  | 0
VarGFaceNet: An Efficient Variable Group Convolutional Neural Network for Lightweight Face Recognition | Code | 0
Cross-modal knowledge distillation for action recognition |  | 0
FedMD: Heterogenous Federated Learning via Model Distillation | Code | 1
Knowledge Distillation from Internal Representations |  | 0
Distilling BERT into Simple Neural Networks with Unlabeled Transfer Data |  | 0
On the Efficacy of Knowledge Distillation |  | 0
Improving Word Embedding Factorization for Compression Using Distilled Nonlinear Neural Decomposition |  | 0
AntMan: Sparse Low-Rank Compression to Accelerate RNN inference |  | 0
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 |  | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 |  | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 |  | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 |  | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 |  | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 |  | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 |  | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 |  | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 |  | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 |  | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 |  | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 |  | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 |  | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 |  | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 |  | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 |  | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 |  | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 |  | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 |  | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 |  | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 |  | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 |  | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 |  | Unverified