
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized, so a smaller "student" model can often be trained to reproduce much of a larger "teacher" model's behavior at a fraction of the inference cost.
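In the classic response-based formulation (Hinton et al.), the student is trained to match the teacher's temperature-softened output distribution in addition to the ground-truth labels. Below is a minimal sketch of that loss in PyTorch; the names `student_logits`, `teacher_logits`, `temperature`, and `alpha` are illustrative and not taken from any particular paper listed on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    """Soft-target distillation loss (response-based KD sketch)."""
    # Soften both output distributions with the same temperature.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)

    # KL term is scaled by T^2 so its gradient magnitude stays comparable
    # to the hard-label term (standard practice from the original paper).
    kd_term = F.kl_div(soft_student, soft_teacher,
                       reduction="batchmean") * (temperature ** 2)

    # Ordinary supervised cross-entropy on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term

# Usage sketch: the teacher is frozen, only the student is updated.
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
```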

Papers

Showing 1301–1325 of 4240 papers

Extracurricular Learning: Knowledge Transfer Beyond Empirical Distribution
Extreme compression of sentence-transformer ranker models: faster inference, longer battery life, and less storage on edge devices
AKD: Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks
Cooperative Learning for Cost-Adaptive Inference
Extracting knowledge from features with multilevel abstraction
A Knowledge Distillation Approach for Sepsis Outcome Prediction from Multivariate Clinical Time Series
Cooperative Denoising for Distantly Supervised Relation Extraction
On Importance of Pruning and Distillation for Efficient Low Resource NLP
Automated Graph Self-supervised Learning via Multi-teacher Knowledge Distillation
Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition
Automated Channel Pruning with Learned Importance
Control Policy Correction Framework for Reinforcement Learning-based Energy Arbitrage Strategies
Controlling the Quality of Distillation in Response-Based Network Compression
Extracting General-use Transformers for Low-resource Languages via Knowledge Distillation
Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation
Extremely Small BERT Models from Mixed-Vocabulary Training
Factual Dialogue Summarization via Learning from Large Language Models
Faithful Knowledge Distillation
Feature-Align Network with Knowledge Distillation for Efficient Denoising
FedSDD: Scalable and Diversity-enhanced Distillation for Model Aggregation in Federated Learning
Contrast-reconstruction Representation Learning for Self-supervised Skeleton-based Action Recognition
Contrast R-CNN for Continual Learning in Object Detection
AUTOKD: Automatic Knowledge Distillation Into A Student Architecture Family
Contrastive Representation Distillation via Multi-Scale Feature Decoupling
A Joint Sequential and Relational Model for Frame-Semantic Parsing

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified |
| 5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified |
| 6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|-------|--------|---------|----------|--------|
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | | Unverified |