SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized, yet evaluating them remains expensive. Distillation trains a compact student model to reproduce the behaviour of the large teacher, retaining much of its accuracy at a fraction of the inference cost.
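
The standard formulation (Hinton et al., 2015) trains the student to match the teacher's temperature-softened output probabilities alongside the usual hard-label loss. The sketch below is a minimal, illustrative PyTorch version of that soft-target loss; the temperature and weighting defaults are assumptions for demonstration, not settings taken from any paper listed here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Soft-target knowledge distillation loss: a weighted sum of
    cross-entropy on hard labels and KL divergence between the
    temperature-softened teacher and student distributions.
    temperature/alpha are illustrative defaults, not canonical values."""
    # Higher temperature spreads probability mass over more classes,
    # exposing the teacher's relative class similarities ("dark knowledge").
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Usage sketch with dummy logits standing in for teacher/student outputs.
teacher_logits = torch.randn(8, 100)                       # frozen teacher
student_logits = torch.randn(8, 100, requires_grad=True)   # trainable student
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```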

Papers

Showing 3451–3500 of 4240 papers

Title | Status | Hype
Class-Balanced Distillation for Long-Tailed Visual Recognition | Code | 1
Dual Discriminator Adversarial Distillation for Data-free Model Compression | - | 0
Data-Free Knowledge Distillation with Soft Targeted Transfer Set Synthesis | - | 0
Towards Enabling Meta-Learning from Target Models | Code | 0
GKD: Semi-supervised Graph Knowledge Distillation for Graph-Independent Inference | Code | 0
Distilling and Transferring Knowledge via cGAN-generated Samples for Image Classification and Regression | Code | 0
Content-Aware GAN Compression | Code | 1
Compressing Visual-linguistic Model via Knowledge Distillation | - | 0
Knowledge Distillation For Wireless Edge Learning | Code | 0
Topic Modeling for Maternal Health Using Reddit | - | 0
Dialect Identification through Adversarial Learning and Knowledge Distillation on Romanian BERT | - | 0
Decentralized and Model-Free Federated Learning: Consensus-Based Distillation in Function Space | - | 0
Unsupervised Domain Expansion for Visual Categorization | Code | 0
Students are the Best Teacher: Exit-Ensemble Distillation with Multi-Exits | Code | 0
Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study | - | 0
Knowledge Distillation By Sparse Representation Matching | Code | 0
Fixing the Teacher-Student Knowledge Discrepancy in Distillation | - | 0
HAD-Net: A Hierarchical Adversarial Knowledge Distillation Network for Improved Enhanced Tumour Segmentation Without Post-Contrast Images | Code | 1
Complementary Relation Contrastive Distillation | Code | 1
Industry Scale Semi-Supervised Learning for Natural Language Understanding | - | 0
Distilling Virtual Examples for Long-tailed Recognition | Code | 0
Embedding Transfer with Label Relaxation for Improved Metric Learning | Code | 1
KnowRU: Knowledge Reusing via Knowledge Distillation in Multi-agent Reinforcement Learning | - | 0
Distilling a Powerful Student Model via Online Knowledge Distillation | Code | 1
Multimodal Knowledge Expansion | Code | 1
A Practical Survey on Faster and Lighter Transformers | - | 0
Distilling Object Detectors via Decoupled Features | Code | 1
Hands-on Guidance for Distilling Object Detectors | - | 0
Leaning Compact and Representative Features for Cross-Modality Person Re-Identification | Code | 0
Weakly-Supervised Domain Adaptation of Deep Regression Trackers via Reinforced Knowledge Distillation | - | 0
Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation | Code | 1
Spirit Distillation: Precise Real-time Semantic Segmentation of Road Scenes with Insufficient Data | - | 0
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures | - | 0
Student Network Learning via Evolutionary Knowledge Distillation | - | 0
Balanced softmax cross-entropy for incremental learning with and without memory | - | 0
ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques | Code | 1
Compacting Deep Neural Networks for Internet of Things: Methods and Applications | - | 0
Variational Knowledge Distillation for Disease Classification in Chest X-Rays | - | 0
Online Lifelong Generalized Zero-Shot Learning | Code | 0
Cost-effective Deployment of BERT Models in Serverless Environment | - | 0
Self-Supervised Adaptation for Video Super-Resolution | Code | 1
Human-Inspired Multi-Agent Navigation using Knowledge Distillation | Code | 1
Similarity Transfer for Knowledge Distillation | - | 0
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation | - | 0
Leveraging Recent Advances in Deep Learning for Audio-Visual Emotion Recognition | - | 0
Robustly Optimized and Distilled Training for Natural Language Understanding | - | 0
Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation | Code | 1
Robust Model Compression Using Deep Hypotheses | Code | 0
A New Training Framework for Deep Neural Network | - | 0
Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones | Code | 1
Page 70 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | - | Unverified