Knowledge Distillation

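Knowledge distillation is the process of transferring knowledge from a large model (the teacher) to a smaller one (the student). Large models, such as very deep neural networks or ensembles of many models, have a higher knowledge capacity than small models, but this capacity is not always fully utilized. A smaller model trained to reproduce the large model's behavior can therefore often retain much of its accuracy at a lower inference cost.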
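As a rough illustration of how such a transfer can be set up, the sketch below implements one common formulation: soft-target distillation, where the small model (student) is trained to match the temperature-softened output distribution of the large model (teacher) alongside the ground-truth label. This page does not prescribe a particular loss, so the function names, the `temperature` and `alpha` parameters, and their default values are assumptions chosen for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, stabilized by subtracting the max."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term.

    `temperature` and `alpha` are illustrative defaults, not values taken
    from this page.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)

    # Ordinary cross-entropy against the ground-truth label (temperature 1).
    ce = -math.log(softmax(student_logits)[label])

    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable when the temperature changes.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


if __name__ == "__main__":
    teacher = [1.0, 2.0, 5.0]   # hypothetical teacher logits
    student = [0.5, 1.5, 2.0]   # hypothetical student logits
    print(distillation_loss(student, teacher, label=2))
```

A higher temperature flattens both distributions, so the student also learns from the relative probabilities the teacher assigns to incorrect classes. In practice this loss would be computed per batch with an automatic-differentiation framework instead of plain Python lists.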

Papers

Showing 2026–2050 of 4240 papers

Title	Date	Tasks	Status
Align-to-Distill: Trainable Attention Alignment for Knowledge Distillation in Neural Machine Translation	Mar 3, 2024	Knowledge Distillation, Machine Translation	Code Available
Teaching MLP More Graph Information: A Three-stage Multitask Knowledge Distillation Framework	Mar 2, 2024	Knowledge Distillation	Unverified
Distilling Text Style Transfer With Self-Explanation From LLMs	Mar 2, 2024	In-Context Learning, Knowledge Distillation	Unverified
Differentially Private Knowledge Distillation via Synthetic Text Generation	Mar 1, 2024	Knowledge Distillation, Model Compression	Code Available
Data-efficient Event Camera Pre-training via Disentangled Masked Modeling	Mar 1, 2024	Knowledge Distillation, Self-Supervised Learning	Unverified
Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs	Feb 29, 2024	Dataset Generation, Knowledge Distillation	Unverified
Weakly Supervised Monocular 3D Detection with a Single-View Image	Feb 29, 2024	Knowledge Distillation, Object Localization	Unverified
MIKO: Multimodal Intention Knowledge Distillation from Large Language Models for Social-Media Commonsense Discovery	Feb 28, 2024	Knowledge Distillation, Language Modeling	Unverified
Gradient Reweighting: Towards Imbalanced Class-Incremental Learning	Feb 28, 2024	Class-Incremental Learning	Unverified
A Lightweight Low-Light Image Enhancement Network via Channel Prior and Gamma Correction	Feb 28, 2024	Image Enhancement, Knowledge Distillation	Unverified
3MVRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding	Feb 28, 2024	document understanding, Form	Code Available
Structural Teacher-Student Normality Learning for Multi-Class Anomaly Detection and Localization	Feb 27, 2024	Anomaly Detection, Knowledge Distillation	Unverified
SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection	Feb 27, 2024	Class-Incremental Learning	Unverified
MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning	Feb 27, 2024	Class-Incremental Learning	Unverified
LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification	Feb 26, 2024	Data Augmentation, Knowledge Distillation	Unverified
SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning	Feb 26, 2024	Knowledge Distillation, Self-Supervised Learning	Unverified
m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers	Feb 26, 2024	Knowledge Distillation, Mixture-of-Experts	Code Available
DTCM: Deep Transformer Capsule Mutual Distillation for Multivariate Time Series Classification	Feb 26, 2024	Knowledge Distillation, Relation Network	Unverified
Distilling Adversarial Robustness Using Heterogeneous Teachers	Feb 23, 2024	Adversarial Robustness, Knowledge Distillation	Unverified
Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off	Feb 22, 2024	Adversarial Defense, Knowledge Distillation	Unverified
Practical Insights into Knowledge Distillation for Pre-Trained Models	Feb 22, 2024	Federated Learning, Knowledge Distillation	Unverified
Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic	Feb 22, 2024	Formal Logic, Knowledge Distillation	Unverified
TIE-KD: Teacher-Independent and Explainable Knowledge Distillation for Monocular Depth Estimation	Feb 22, 2024	Depth Estimation, Knowledge Distillation	Code Available
Unsupervised Text Style Transfer via LLMs and Attention Masking with Multi-way Interactions	Feb 21, 2024	In-Context Learning, Knowledge Distillation	Unverified
In-Distribution Consistency Regularization Improves the Generalization of Quantization-Aware Training	Feb 21, 2024	Knowledge Distillation, Quantization	Unverified

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

ImageNet
#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

CIFAR-100
#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

COCO (Common Objects in Context)
#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

COCO 2017 val
#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

PASCAL VOC
#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

KITTI
#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified