Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
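As an illustration of how such a transfer is often set up, here is a minimal sketch assuming the common soft-target formulation: the student is trained to match the teacher's temperature-softened output distribution while also fitting the true label. The page above does not specify any particular loss, so the function names `softmax` and `distillation_loss` and the parameters `temperature` and `alpha` are hypothetical choices made for this example only.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term.

    Assumed formulation (not taken from the page): alpha weights the soft term,
    (1 - alpha) weights the hard term.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student))
    # Cross-entropy of the student against the true label at temperature 1
    ce = -math.log(softmax(student_logits)[label])
    # Multiplying by T^2 keeps the soft-term gradient scale comparable across temperatures
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce

# Toy logits for a 3-class problem (illustrative values only)
teacher = [6.0, 2.0, 1.0]
student = [3.0, 1.5, 0.5]
loss = distillation_loss(student, teacher, label=0)
print(f"distillation loss: {loss:.4f}")
```

A higher `temperature` flattens the teacher's distribution so that the relative probabilities of the wrong classes carry more signal. When the student's logits equal the teacher's, the soft term is zero and only the hard-label term remains.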

Papers



Title	Date	Tasks	Status
Collaborative Teacher-Student Learning via Multiple Knowledge Transfer	Jan 21, 2021	Knowledge Distillation, Model Compression	Unverified
Bridging the gap between Human Action Recognition and Online Action Detection	Jan 21, 2021	Action Detection, Action Recognition	Unverified
Deep Epidemiological Modeling by Black-box Knowledge Distillation: An Accurate Deep Learning Model for COVID-19	Jan 20, 2021	Diversity, Knowledge Distillation	Unverified
Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation	Jan 20, 2021	Knowledge Distillation	Unverified
Incremental Knowledge Based Question Answering	Jan 18, 2021	Incremental Learning, Knowledge Distillation	Unverified
Knowledge Distillation Methods for Efficient Unsupervised Adaptation Across Multiple Domains	Jan 18, 2021	Domain Adaptation, image-classification	Unverified
Mining Data Impressions from Deep Models as Substitute for the Unavailable Training Data	Jan 15, 2021	Adversarial Robustness, Continual Learning	Unverified
KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization	Jan 15, 2021	Knowledge Distillation, Language Modelling	Unverified
Interpretable discovery of new semiconductors with machine learning	Jan 12, 2021	BIG-bench Machine Learning, Knowledge Distillation	Unverified
Resolution-Based Distillation for Efficient Histology Image Classification	Jan 11, 2021	Classification, Computational Efficiency	Unverified
Adversarially Robust and Explainable Model Compression with On-Device Personalization for Text Classification	Jan 10, 2021	Adversarial Robustness, General Classification	Unverified
Spending Your Winning Lottery Better After Drawing It	Jan 8, 2021	Knowledge Distillation	Code Available
MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding	Jan 6, 2021	Knowledge Distillation, Meta-Learning	Unverified
Label Augmentation via Time-based Knowledge Distillation for Financial Anomaly Detection	Jan 5, 2021	Anomaly Detection, Knowledge Distillation	Unverified
Knowledge distillation via softmax regression representation learning	Jan 1, 2021	Knowledge Distillation, Model Compression	Unverified
Robust Overfitting may be mitigated by properly learned smoothening	Jan 1, 2021	Knowledge Distillation	Unverified
FLAR: A Unified Prototype Framework for Few-Sample Lifelong Active Recognition	Jan 1, 2021	Knowledge Distillation, Lifelong learning	Unverified
Understanding Adversarial Attacks on Autoencoders	Jan 1, 2021	Compressive Sensing, Knowledge Distillation	Unverified
Understanding Knowledge Distillation	Jan 1, 2021	Knowledge Distillation	Unverified
Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning	Jan 1, 2021	class-incremental learning, Class Incremental Learning	Unverified
Disentanglement, Visualization and Analysis of Complex Features in DNNs	Jan 1, 2021	Disentanglement, Knowledge Distillation	Unverified
Unpaired Learning for Deep Image Deraining With Rain Direction Regularizer	Jan 1, 2021	Knowledge Distillation, Rain Removal	Unverified
Explicit Connection Distillation	Jan 1, 2021	image-classification, Image Classification	Unverified
Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective	Jan 1, 2021	Knowledge Distillation	Unverified
Student Customized Knowledge Distillation: Bridging the Gap Between Student and Teacher	Jan 1, 2021	image-classification, Image Classification	Unverified
Learning from deep model via exploring local targets	Jan 1, 2021	Knowledge Distillation, model	Unverified
Improving De-Raining Generalization via Neural Reorganization	Jan 1, 2021	Knowledge Distillation	Unverified
Can Students Outperform Teachers in Knowledge Distillation based Model Compression?	Jan 1, 2021	Knowledge Distillation, Model Compression	Unverified
Contextual Knowledge Distillation for Transformer Compression	Jan 1, 2021	Knowledge Distillation, Language Modeling	Unverified
Don't be picky, all students in the right family can learn from good teachers	Jan 1, 2021	All, Bayesian Optimization	Unverified
Active Learning for Lane Detection: A Knowledge Distillation Approach	Jan 1, 2021	2D Object Detection, Active Learning	Unverified
Knowledge Distillation based Ensemble Learning for Neural Machine Translation	Jan 1, 2021	Ensemble Learning, Knowledge Distillation	Unverified
Distilling Global and Local Logits With Densely Connected Relations	Jan 1, 2021	image-classification, Image Classification	Code Available
Kernel Methods in Hyperbolic Spaces	Jan 1, 2021	Few-Shot Learning, image-classification	Unverified
Fully Synthetic Data Improves Neural Machine Translation with Knowledge Distillation	Dec 31, 2020	Knowledge Distillation, Machine Translation	Unverified
Towards Zero-Shot Knowledge Distillation for Natural Language Processing	Dec 31, 2020	Knowledge Distillation, Model Compression	Unverified
Knowledge Distillation with Adaptive Asymmetric Label Sharpening for Semi-supervised Fracture Detection in Chest X-rays	Dec 30, 2020	Fracture detection, Knowledge Distillation	Unverified
Understanding and Improving Lexical Choice in Non-Autoregressive Translation	Dec 29, 2020	Knowledge Distillation, Translation	Unverified
ALP-KD: Attention-Based Layer Projection for Knowledge Distillation	Dec 27, 2020	Knowledge Distillation	Unverified
Towards a Universal Continuous Knowledge Base	Dec 25, 2020	Knowledge Distillation, text-classification	Unverified
Future-Guided Incremental Transformer for Simultaneous Translation	Dec 23, 2020	Knowledge Distillation, Translation	Unverified
AttentionLite: Towards Efficient Self-Attention Models for Vision	Dec 21, 2020	Knowledge Distillation	Unverified
Diverse Knowledge Distillation for End-to-End Person Search	Dec 21, 2020	Human Detection, Knowledge Distillation	Unverified
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning	Dec 17, 2020	Deep Learning, Knowledge Distillation	Unverified
Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces	Dec 16, 2020	GPU, Knowledge Distillation	Unverified
Wasserstein Contrastive Representation Distillation	Dec 15, 2020	Contrastive Learning, Knowledge Distillation	Unverified
LRC-BERT: Latent-representation Contrastive Knowledge Distillation for Natural Language Understanding	Dec 14, 2020	Contrastive Learning, Knowledge Distillation	Unverified
Periocular Embedding Learning with Consistent Knowledge Distillation from Face	Dec 12, 2020	Knowledge Distillation, Prediction	Unverified
Improving Task-Agnostic BERT Distillation with Layer Mapping Search	Dec 11, 2020	Knowledge Distillation	Unverified
Reinforced Multi-Teacher Selection for Knowledge Distillation	Dec 11, 2020	GPU, Knowledge Distillation	Unverified


Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified