Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
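
As a rough illustration of how such a transfer can be set up, the sketch below computes the widely used temperature-softened distillation loss, where a student is trained to match a teacher's softened output distribution in addition to the true label. The page itself does not prescribe a method, so this particular formulation, the function names, and the `temperature` and `alpha` defaults are all assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and hard-label cross-entropy.

    temperature and alpha are illustrative defaults (assumptions),
    not values taken from the page.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    # Cross-entropy against the ground-truth label at temperature 1
    ce = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft term on a comparable scale
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

# Hypothetical logits for a 3-class problem
teacher = [5.0, 2.0, 0.5]
student = [3.0, 2.5, 0.2]
loss = distillation_loss(student, teacher, label=0)
```

With `alpha=1.0` only the soft-target term remains, and it is zero when the student's logits equal the teacher's; lowering `alpha` shifts weight toward the ordinary supervised loss.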

Papers

Showing 3751–3800 of 4240 papers

Title	Date	Tasks	Status
Large-Scale Generative Data-Free Distillation	Dec 10, 2020	Knowledge Distillation, Model Compression	Unverified
On Knowledge Distillation for Direct Speech Translation	Dec 9, 2020	Automatic Speech Recognition (ASR)	Unverified
Model Compression Using Optimal Transport	Dec 7, 2020	Image Classification	Unverified
Parallel Blockwise Knowledge Distillation for Deep Neural Network Compression	Dec 5, 2020	Knowledge Distillation, Neural Network Compression	Code Available
Reciprocal Supervised Learning Improves Neural Machine Translation	Dec 5, 2020	Image Classification	Code Available
Multi-head Knowledge Distillation for Model Compression	Dec 5, 2020	Image Classification	Unverified
Meta-KD: A Meta Knowledge Distillation Framework for Language Model Compression across Domains	Dec 2, 2020	Knowledge Distillation, Language Modeling	Unverified
Self-Supervised Generative Adversarial Compression	Dec 1, 2020	Image Classification	Unverified
Solvable Model for Inheriting the Regularization through Knowledge Distillation	Dec 1, 2020	Knowledge Distillation, Transfer Learning	Unverified
Query Distillation: BERT-based Distillation for Ensemble Ranking	Dec 1, 2020	Knowledge Distillation	Unverified
Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability	Dec 1, 2020	Classification, Fairness	Unverified
Reverse-engineering recurrent neural network solutions to a hierarchical inference task for mice	Dec 1, 2020	Knowledge Distillation, Model Compression	Unverified
A Selective Survey on Versatile Knowledge Distillation Paradigm for Neural Network Models	Nov 30, 2020	Knowledge Distillation, Model Compression	Unverified
Real-time Spatio-temporal Action Localization via Learning Motion Representation	Nov 30, 2020	Action Classification, Action Localization	Unverified
Adaptive Multiplane Image Generation from a Single Internet Picture	Nov 26, 2020	Depth Estimation, Image Generation	Unverified
torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation	Nov 25, 2020	Image Classification, Instance Segmentation	Unverified
Generative Adversarial Simulator	Nov 23, 2020	Data-free Knowledge Distillation, Knowledge Distillation	Unverified
MixMix: All You Need for Data-Free Compression Are Feature and Data Mixing	Nov 19, 2020	All, Knowledge Distillation	Unverified
A Knowledge Distillation Ensemble Framework for Predicting Short and Long-term Hospitalisation Outcomes from Electronic Health Records Data	Nov 18, 2020	Decision Making, ICU Admission	Code Available
Privileged Knowledge Distillation for Online Action Detection	Nov 18, 2020	Action Detection, Knowledge Distillation	Unverified
Effectiveness of Arbitrary Transfer Sets for Data-free Knowledge Distillation	Nov 18, 2020	Data-free Knowledge Distillation, Knowledge Distillation	Unverified
Generalized Continual Zero-Shot Learning	Nov 17, 2020	Continual Learning, Knowledge Distillation	Unverified
Deep Serial Number: Computational Watermarking for DNN Intellectual Property Protection	Nov 17, 2020	Knowledge Distillation, valid	Unverified
Digging Deeper into CRNN Model in Chinese Text Images Recognition	Nov 17, 2020	Denoising, Knowledge Distillation	Unverified
Online Ensemble Model Compression using Knowledge Distillation	Nov 15, 2020	Knowledge Distillation, model	Code Available
Real-Time Decentralized knowledge Transfer at the Edge	Nov 11, 2020	Knowledge Distillation, Transfer Learning	Code Available
EGAD: Evolving Graph Representation Learning with Self-Attention and Knowledge Distillation for Live Video Streaming Events	Nov 11, 2020	Graph Representation Learning, Knowledge Distillation	Code Available
Distill2Vec: Dynamic Graph Representation Learning with Knowledge Distillation	Nov 11, 2020	Graph Representation Learning, Knowledge Distillation	Code Available
On Estimating the Training Cost of Conversational Recommendation Systems	Nov 10, 2020	Conversational Recommendation, Knowledge Distillation	Unverified
Knowledge Distillation for Singing Voice Detection	Nov 9, 2020	Information Retrieval, Knowledge Distillation	Code Available
Ensemble Knowledge Distillation for CTR Prediction	Nov 8, 2020	Click-Through Rate Prediction, Knowledge Distillation	Unverified
Robustness and Diversity Seeking Data-Free Knowledge Distillation	Nov 7, 2020	Data-free Knowledge Distillation, Diversity	Code Available
Human-Like Active Learning: Machines Simulating the Human Learning Process	Nov 7, 2020	Active Learning, Form	Unverified
Channel Planting for Deep Neural Networks using Knowledge Distillation	Nov 4, 2020	Knowledge Distillation, Network Pruning	Unverified
On Self-Distilling Graph Neural Network	Nov 4, 2020	Graph Embedding, Graph Neural Network	Unverified
Paralinguistic Privacy Protection at the Edge	Nov 4, 2020	CPU, Knowledge Distillation	Unverified
A Comprehensive Study of Class Incremental Learning Algorithms for Visual Tasks	Nov 3, 2020	Class Incremental Learning	Unverified
Distilling Knowledge by Mimicking Features	Nov 3, 2020	Knowledge Distillation, Object Detection	Code Available
Learning to Maximize Speech Quality Directly Using MOS Prediction for Neural Text-to-Speech	Nov 2, 2020	Knowledge Distillation, Speech Synthesis	Unverified
Data-free Knowledge Distillation for Segmentation using Data-Enriching GAN	Nov 2, 2020	Data-free Knowledge Distillation, Diversity	Code Available
The NiuTrans Machine Translation Systems for WMT20	Nov 1, 2020	Knowledge Distillation, Machine Translation	Unverified
IIE’s Neural Machine Translation Systems for WMT20	Nov 1, 2020	Domain Adaptation, Knowledge Distillation	Unverified
HW-TSC’s Participation in the WMT 2020 News Translation Shared Task	Nov 1, 2020	Knowledge Distillation, Translation	Unverified
High Performance Natural Language Processing	Nov 1, 2020	Knowledge Distillation, Quantization	Unverified
Using the Past Knowledge to Improve Sentiment Classification	Nov 1, 2020	Classification, Knowledge Distillation	Unverified
Distilling Structured Knowledge for Text-Based Relational Reasoning	Nov 1, 2020	Contrastive Learning, Knowledge Distillation	Unverified
Fast End-to-end Coreference Resolution for Korean	Nov 1, 2020	Coreference Resolution	Unverified
Bridging the Gap between Prior and Posterior Knowledge Selection for Knowledge-Grounded Dialogue Generation	Nov 1, 2020	Decoder, Dialogue Generation	Unverified
FedED: Federated Learning via Ensemble Distillation for Medical Relation Extraction	Nov 1, 2020	Federated Learning, Knowledge Distillation	Unverified
MixKD: Towards Efficient Distillation of Large-scale Language Models	Nov 1, 2020	Data Augmentation, Knowledge Distillation	Unverified

Datasets (the benchmark tables below appear to follow this order): ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified