Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
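
Many of the papers below build on the classic logit-matching recipe of Hinton et al. (2015): the student is trained to match the teacher's temperature-softened output distribution alongside the ground-truth labels. Below is a minimal PyTorch sketch of that loss; the function name, the temperature T = 4.0, and the weighting alpha = 0.5 are illustrative choices, not taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with KL divergence between
    temperature-softened teacher and student distributions."""
    # Temperature T > 1 flattens both distributions, exposing the teacher's
    # relative probabilities over non-target classes ("dark knowledge").
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    student_log_probs = F.log_softmax(student_logits / T, dim=-1)
    # The T^2 factor keeps the soft-target gradients on the same scale
    # as the hard-label term when T changes (Hinton et al., 2015).
    soft_loss = F.kl_div(student_log_probs, soft_targets,
                         reduction="batchmean") * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * hard_loss + (1.0 - alpha) * soft_loss

# Smoke test with random logits for a batch of 8 over 100 classes.
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)  # would come from a frozen teacher
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

In practice the teacher's forward pass is wrapped in `torch.no_grad()` (or its logits are detached), so only the student receives gradient updates.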

Papers

Showing 3901–3950 of 4240 papers

Title | Status | Hype
Towards Non-task-specific Distillation of BERT via Sentence Representation Approximation | – | 0
Enhancing Review Comprehension with Domain-Specific Commonsense | – | 0
Temporally Distributed Networks for Fast Video Semantic Segmentation | Code | 1
More Grounded Image Captioning by Distilling Image-Text Matching Model | Code | 1
Knowledge as Priors: Cross-Modal Knowledge Generalization for Datasets without Superior Knowledge | – | 0
Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing | Code | 1
Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation from a Blackbox Model | Code | 1
Distilled Semantics for Comprehensive Scene Understanding from Videos | Code | 1
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation | – | 0
SS-IL: Separated Softmax for Incremental Learning | – | 0
Regularizing Class-wise Predictions via Self-knowledge Distillation | Code | 1
Squeezed Deep 6DoF Object Detection Using Knowledge Distillation | Code | 0
Analysis of Knowledge Transfer in Kernel Regime | – | 0
Circumventing Outliers of AutoAugment with Knowledge Distillation | Code | 1
A Survey of Methods for Low-Power Deep Learning and Computer Vision | – | 0
Synergic Adversarial Label Learning for Grading Retinal Diseases via Knowledge Distillation and Multi-task Learning | – | 0
Distilling Knowledge from Graph Convolutional Networks | Code | 1
Collaborative Distillation for Ultra-Resolution Universal Style Transfer | Code | 1
Incremental Object Detection via Meta-Learning | Code | 1
Teacher-Student chain for efficient semi-supervised histology image classification | – | 0
Deformation Flow Based Two-Stream Network for Lip Reading | Code | 1
SuperMix: Supervising the Mixing Data Augmentation | Code | 1
Knowledge distillation via adaptive instance normalization | – | 0
Faster ILOD: Incremental Learning for Object Detectors based on Faster RCNN | Code | 1
Pacemaker: Intermediate Teacher Knowledge Distillation For On-The-Fly Convolutional Neural Network | – | 0
PoseNet3D: Learning Temporally Consistent 3D Human Pose via Knowledge Distillation | Code | 1
Distilling portable Generative Adversarial Networks for Image Translation | – | 0
Explaining Knowledge Distillation by Quantifying the Knowledge | – | 0
Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation | – | 0
An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation | – | 0
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing | Code | 2
Efficient Semantic Video Segmentation with Per-frame Inference | Code | 1
Semi-Supervised Speech Recognition via Local Prior Matching | Code | 3
Residual Knowledge Distillation | – | 0
Balancing Cost and Benefit with Tied-Multi Transformers | – | 0
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding | – | 0
Knapsack Pruning with Inner Distillation | Code | 1
Self-Distillation Amplifies Regularization in Hilbert Space | – | 0
Salvaging Federated Learning by Local Adaptation | Code | 1
Content Based Singing Voice Extraction From a Musical Mixture | Code | 0
Meta-Learning across Meta-Tasks for Few-Shot Learning | – | 0
Regularized Evolutionary Population-Based Training | – | 0
Knowledge Distillation for Brain Tumor Segmentation | Code | 1
Understanding and Improving Knowledge Distillation | – | 0
Unlabeled Data Deployment for Classification of Diabetic Retinopathy Images Using Knowledge Transfer | – | 0
SUOD: Toward Scalable Unsupervised Outlier Detection | Code | 1
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing | Code | 1
Feature-map-level Online Adversarial Knowledge Distillation | – | 0
Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning | Code | 0
Search for Better Students to Learn Distilled Knowledge | – | 0

("Code" in the Status column marks papers with linked code; "–" marks papers without.)
Page 79 of 85

Benchmark Results

Each entry lists the distillation method with its teacher (T) and student (S) networks; "–" marks an empty Verified cell.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified