Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). Large models, such as very deep neural networks or ensembles of many models, have a higher knowledge capacity than small models, but this capacity might not be fully utilized.
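As a rough illustration of how such a transfer can be set up, the sketch below computes a soft-target distillation loss between a large model's outputs (the teacher) and a small model's outputs (the student). It assumes the common formulation of temperature-softened softmax outputs compared with KL divergence, which the description above does not itself specify. The function names, the example logits, and the default temperature of 4.0 are illustrative assumptions, not values taken from any paper listed here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is a common convention that keeps gradient magnitudes
    comparable across temperatures; it is an assumption of this sketch.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits for a 3-class problem.
teacher = [6.0, 2.0, -1.0]
student = [3.0, 2.5, 0.0]
print(distillation_loss(student, teacher))
```

In practice this term is usually combined with an ordinary supervised loss on the ground-truth labels. The loss is zero when the student reproduces the teacher's softened distribution exactly and positive otherwise.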

Papers


Showing 2701–2725 of 4240 papers

Title	Date	Tasks	Status	Hype
MiniDisc: Minimal Distillation Schedule for Language Model Compression	May 29, 2022	Knowledge Distillation, Language Modeling	Code Available	0
Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors	May 28, 2022	Domain Adaptation, Knowledge Distillation	Code Available	1
One Reference Is Not Enough: Diverse Distillation with Reference Selection for Non-Autoregressive Translation	May 28, 2022	Knowledge Distillation, Machine Translation	Code Available	0
Parameter-Efficient and Student-Friendly Knowledge Distillation	May 28, 2022	Knowledge Distillation, Transfer Learning	Unverified	0
Geometer: Graph Few-Shot Class-Incremental Learning via Prototype Representation	May 27, 2022	class-incremental learning, Class Incremental Learning	Code Available	1
Continual evaluation for lifelong learning: Identifying the stability gap	May 26, 2022	Continual Learning, Incremental Learning	Code Available	1
Region-aware Knowledge Distillation for Efficient Image-to-Image Translation	May 25, 2022	Contrastive Learning, image-classification	Unverified	0
Do we need Label Regularization to Fine-tune Pre-trained Language Models?	May 25, 2022	Knowledge Distillation, Model Compression	Unverified	0
DFM: Dialogue Foundation Model for Universal Large-Scale Dialogue-Oriented Task Learning	May 25, 2022	Dialogue Generation, Diversity	Unverified	0
Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation	May 24, 2022	Graph Classification, Knowledge Distillation	Code Available	1
Optimizing Performance of Federated Person Re-identification: Benchmarking and Analysis	May 24, 2022	Benchmarking, Federated Learning	Code Available	1
CDFKD-MFS: Collaborative Data-free Knowledge Distillation via Multi-level Feature Sharing	May 24, 2022	Data-free Knowledge Distillation, Knowledge Distillation	Code Available	0
IDEAL: Query-Efficient Data-Free Learning from Black-box Models	May 23, 2022	Knowledge Distillation	Code Available	1
Boosting Multi-Label Image Classification with Complementary Parallel Self-Distillation	May 23, 2022	image-classification, Image Classification	Code Available	1
PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection	May 23, 2022	3D Object Detection, Knowledge Distillation	Code Available	1
LILA-BOTI : Leveraging Isolated Letter Accumulations By Ordering Teacher Insights for Bangla Handwriting Recognition	May 23, 2022	Handwriting Recognition, Knowledge Distillation	Code Available	0
Knowledge Distillation via the Target-aware Transformer	May 22, 2022	Knowledge Distillation	Code Available	1
Aligning Logits Generatively for Principled Black-Box Knowledge Distillation	May 21, 2022	Federated Learning, Knowledge Distillation	Code Available	0
Knowledge Distillation from A Stronger Teacher	May 21, 2022	image-classification, Image Classification	Code Available	1
Exploring Extreme Parameter Compression for Pre-trained Language Models	May 20, 2022	Knowledge Distillation, Tensor Decomposition	Code Available	1
InDistill: Information flow-preserving knowledge distillation for model compression	May 20, 2022	Knowledge Distillation, Model Compression	Code Available	0
Simple Regularisation for Uncertainty-Aware Knowledge Distillation	May 19, 2022	BIG-bench Machine Learning, Diversity	Unverified	0
ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval	May 18, 2022	Knowledge Distillation, Open-Domain Question Answering	Unverified	0
Prompting to Distill: Boosting Data-Free Knowledge Distillation via Reinforced Prompt	May 16, 2022	Data-free Knowledge Distillation, Knowledge Distillation	Unverified	0
Chemical transformer compression for accelerating both training and inference of molecular modeling	May 16, 2022	Knowledge Distillation, Model Compression	Code Available	0


Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

ImageNet

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

CIFAR-100

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

COCO (Common Objects in Context)

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

COCO 2017 val

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

PASCAL VOC

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

KITTI

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified