SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. Large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, but that capacity is often not fully utilized. A compact "student" model trained to mimic a large "teacher" model, typically by matching the teacher's temperature-softened output probabilities in addition to the ground-truth labels, can therefore often approach the teacher's accuracy at a fraction of the inference cost.
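In the classic recipe (Hinton et al., 2015), the student minimizes a weighted sum of two terms: the usual cross-entropy on the hard labels, and a KL divergence between the teacher's and student's temperature-softened output distributions. The sketch below shows this loss in PyTorch; the `temperature` and `alpha` values are illustrative defaults, not settings taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    """Classic KD objective: soft-target KL term + hard-label cross-entropy."""
    # Soften both output distributions with the temperature, then match
    # the student to the teacher with KL divergence.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The T^2 factor restores the gradient scale that dividing by T removes,
    # keeping the two terms comparable as the temperature changes.
    kd_term = F.kl_div(log_student, soft_targets,
                       reduction="batchmean") * temperature ** 2
    # Standard supervised term on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Toy usage with random tensors (shapes only; no real teacher or student here).
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

In an actual training loop the teacher's forward pass runs under `torch.no_grad()`, so only the student receives gradients.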

Papers

Showing 3651-3700 of 4240 papers

Title | Status | Hype
Topic Modeling for Maternal Health Using Reddit | - | 0
Decentralized and Model-Free Federated Learning: Consensus-Based Distillation in Function Space | - | 0
Unsupervised Domain Expansion for Visual Categorization | Code | 0
Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study | - | 0
Fixing the Teacher-Student Knowledge Discrepancy in Distillation | - | 0
Knowledge Distillation By Sparse Representation Matching | Code | 0
Industry Scale Semi-Supervised Learning for Natural Language Understanding | - | 0
Distilling Virtual Examples for Long-tailed Recognition | Code | 0
KnowRU: Knowledge Reusing via Knowledge Distillation in Multi-agent Reinforcement Learning | - | 0
Weakly-Supervised Domain Adaptation of Deep Regression Trackers via Reinforced Knowledge Distillation | - | 0
Hands-on Guidance for Distilling Object Detectors | - | 0
Leaning Compact and Representative Features for Cross-Modality Person Re-Identification | Code | 0
A Practical Survey on Faster and Lighter Transformers | - | 0
Spirit Distillation: Precise Real-time Semantic Segmentation of Road Scenes with Insufficient Data | - | 0
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures | - | 0
Student Network Learning via Evolutionary Knowledge Distillation | - | 0
Balanced softmax cross-entropy for incremental learning with and without memory | - | 0
Compacting Deep Neural Networks for Internet of Things: Methods and Applications | - | 0
Online Lifelong Generalized Zero-Shot Learning | Code | 0
Variational Knowledge Distillation for Disease Classification in Chest X-Rays | - | 0
Cost-effective Deployment of BERT Models in Serverless Environment | - | 0
Similarity Transfer for Knowledge Distillation | - | 0
Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation | - | 0
Leveraging Recent Advances in Deep Learning for Audio-Visual Emotion Recognition | - | 0
Robustly Optimized and Distilled Training for Natural Language Understanding | - | 0
Robust Model Compression Using Deep Hypotheses | Code | 0
A New Training Framework for Deep Neural Network | - | 0
Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning | - | 0
Deep Neural Network Models Compression | - | 0
Feature-Align Network with Knowledge Distillation for Efficient Denoising | - | 0
Embedded Knowledge Distillation in Depth-Level Dynamic Neural Network | - | 0
Alignment Knowledge Distillation for Online Streaming Attention-based Speech Recognition | - | 0
PURSUhInT: In Search of Informative Hint Points Based on Layer Clustering for Knowledge Distillation | - | 0
Knowledge Distillation Circumvents Nonlinearity for Optical Convolutional Neural Networks | - | 0
Enhancing Data-Free Adversarial Distillation with Activation Regularization and Virtual Interpolation | - | 0
Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation | - | 0
Exploring Knowledge Distillation of a Deep Neural Network for Multi-Script identification | - | 0
Hierarchical Transformer-based Large-Context End-to-end ASR with Large-Context Knowledge Distillation | - | 0
End-to-End Automatic Speech Recognition with Deep Mutual Learning | - | 0
CAP-GAN: Towards Adversarial Robustness with Cycle-consistent Attentional Purification | - | 0
Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification | - | 0
Improved Customer Transaction Classification using Semi-Supervised Knowledge Distillation | - | 0
Self Regulated Learning Mechanism for Data Efficient Knowledge Distillation | - | 0
Semantically-Conditioned Negative Samples for Efficient Contrastive Learning | - | 0
Learning Student-Friendly Teacher Networks for Knowledge Distillation | - | 0
NewsBERT: Distilling Pre-trained Language Model for Intelligent News Application | - | 0
Do Not Forget to Attend to Uncertainty while Mitigating Catastrophic Forgetting | - | 0
Evolutionary Generative Adversarial Networks with Crossover Based Knowledge Distillation | Code | 0
ISP Distillation | - | 0
Network-Agnostic Knowledge Transfer for Medical Image Segmentation | - | 0
Page 74 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T: resnet32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T: resnet32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | - | Unverified