SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
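
To make the definition concrete, below is a minimal sketch of the classic soft-target distillation objective (Hinton et al., 2015), assuming PyTorch; the temperature `T`, the weight `alpha`, and the `teacher`/`student` variable names are illustrative choices, not taken from any paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target knowledge distillation: blend a KL term on
    temperature-softened logits with the usual cross-entropy on hard labels.
    T and alpha are illustrative hyperparameters."""
    # Soften both distributions with temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    # KL divergence between student and teacher predictions; the T**2 factor
    # keeps gradient magnitudes comparable across temperatures.
    kd_term = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T ** 2)
    # Standard supervised loss on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Usage sketch: the teacher is frozen and only the student receives gradients.
# teacher.eval()
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
# loss.backward()
```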

Papers

Showing 801–850 of 4240 papers

Title | Status | Hype
CheXseg: Combining Expert Annotations with DNN-generated Saliency Maps for X-ray Segmentation | Code | 1
Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching | Code | 1
ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models | Code | 1
Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective | Code | 1
Memory-Efficient Semi-Supervised Continual Learning: The World is its Own Replay Buffer | Code | 1
SEED: Self-supervised Distillation For Visual Representation | Code | 1
Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed | Code | 1
Self-Mutual Distillation Learning for Continuous Sign Language Recognition | Code | 1
Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation | Code | 1
Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors | Code | 1
Unified Mandarin TTS Front-end Based on Distilled BERT Model | Code | 1
CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade | Code | 1
Learning Light-Weight Translation Models from Deep Transformer | Code | 1
Invariant Teacher and Equivariant Student for Unsupervised 3D Human Pose Estimation | Code | 1
Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup | Code | 1
Progressive Network Grafting for Few-Shot Knowledge Distillation | Code | 1
Distilling Knowledge from Reader to Retriever for Question Answering | Code | 1
DE-RRD: A Knowledge Distillation Framework for Recommender System | Code | 1
Cross-Layer Distillation with Semantic Calibration | Code | 1
What Makes a "Good" Data Augmentation in Knowledge Distillation -- A Statistical Perspective | Code | 1
Going Beyond Classification Accuracy Metrics in Model Compression | Code | 1
Multi-level Knowledge Distillation via Knowledge Alignment and Correlation | Code | 1
Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space | Code | 1
Knowledge Base Embedding By Cooperative Knowledge Distillation | Code | 1
Task-Oriented Feature Distillation | Code | 1
KD-Lib: A PyTorch library for Knowledge Distillation, Pruning and Quantization | Code | 1
Prototype-based Incremental Few-Shot Semantic Segmentation | Code | 1
Channel-wise Knowledge Distillation for Dense Prediction | Code | 1
Multiresolution Knowledge Distillation for Anomaly Detection | Code | 1
Evolving Search Space for Neural Architecture Search | Code | 1
Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems | Code | 1
KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation | Code | 1
Anomaly Detection in Video via Self-Supervised and Multi-Task Learning | Code | 1
Federated Knowledge Distillation | Code | 1
Domain Adaptive Knowledge Distillation for Driving Scene Semantic Segmentation | Code | 1
FastFormers: Highly Efficient Transformer Models for Natural Language Understanding | Code | 1
Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation | Code | 1
Distilling Dense Representations for Ranking using Tightly-Coupled Teachers | Code | 1
Knowledge Distillation for BERT Unsupervised Domain Adaptation | Code | 1
Reducing the Teacher-Student Gap via Spherical Knowledge Disitllation | Code | 1
Task Decoupled Knowledge Distillation For Lightweight Face Detectors | Code | 1
Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation | Code | 1
Improving Neural Topic Models using Knowledge Distillation | Code | 1
Lifelong Language Knowledge Distillation | Code | 1
Self-training Improves Pre-training for Natural Language Understanding | Code | 1
Contrastive Distillation on Intermediate Representations for Language Model Compression | Code | 1
TinyGAN: Distilling BigGAN for Conditional Image Generation | Code | 1
Densely Guided Knowledge Distillation using Multiple Teacher Assistants | Code | 1
MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks | Code | 1
S2SD: Simultaneous Similarity-based Self-Distillation for Deep Metric Learning | Code | 1
Page 17 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified
9 | DiffKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified
10 | DIST (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified
3 | MV-MR (T:CLIP/ViT-B-16 S:resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified
4 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified
5 | resnet8x4 (T:resnet32x4 S:resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified
8 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified
9 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified
10 | resnet8x4 (T:resnet32x4 S:resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T:ResNet101 S:ResNet50) | mAP | 93.17 | | Unverified
2 | LSHFM (T:ResNet101 S:MobileNetV2) | mAP | 90.14 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T:Adabins S:MobileNetV2) | RMSE | 2.43 | | Unverified