Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 901–950 of 4240 papers

Title	Date	Tasks	Status	Hype
BiLD: Bi-directional Logits Difference Loss for Large Language Model Distillation	Jun 19, 2024	Knowledge DistillationLanguage Modeling	CodeCode Available	1
Simplified TinyBERT: Knowledge Distillation for Document Retrieval	Sep 16, 2020	Document RankingKnowledge Distillation	CodeCode Available	1
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval	Oct 7, 2022	Knowledge DistillationRetrieval	CodeCode Available	1
SLMRec: Distilling Large Language Models into Small for Sequential Recommendation	May 28, 2024	Knowledge DistillationLanguage Modeling	CodeCode Available	1
A New Knowledge Distillation Network for Incremental Few-Shot Surface Defect Detection	Sep 1, 2022	Defect DetectionKnowledge Distillation	CodeCode Available	1
Decoupled Multimodal Distilling for Emotion Recognition	Mar 24, 2023	Emotion RecognitionKnowledge Distillation	CodeCode Available	1
Does Knowledge Distillation Really Work?	Jun 10, 2021	Knowledge Distillation	CodeCode Available	1
A Neural Span-Based Continual Named Entity Recognition Model	Feb 23, 2023	Continual LearningContinual Named Entity Recognition	CodeCode Available	1
Domain Generalization for Prostate Segmentation in Transrectal Ultrasound Images: A Multi-center Study	Sep 5, 2022	Domain AdaptationDomain Generalization	CodeCode Available	1
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models	May 28, 2023	Knowledge DistillationSelf-Supervised Learning	CodeCode Available	1
DM-VTON: Distilled Mobile Real-time Virtual Try-On	Aug 26, 2023	GPUHuman Parsing	CodeCode Available	1
DeepAqua: Self-Supervised Semantic Segmentation of Wetland Surface Water Extent with SAR Images using Knowledge Distillation	May 2, 2023	Knowledge DistillationSemantic Segmentation	CodeCode Available	1
Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval	Oct 11, 2022	Knowledge DistillationQuantization	CodeCode Available	1
SpikingBERT: Distilling BERT to Train Spiking Language Models Using Implicit Differentiation	Aug 21, 2023	Knowledge DistillationLanguage Modelling	CodeCode Available	1
DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture	Sep 5, 2024	Data-free Knowledge DistillationDenoising	CodeCode Available	1
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval	Jun 24, 2021	Computational EfficiencyKnowledge Distillation	CodeCode Available	1
Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing	Mar 10, 2024	Image RetrievalKnowledge Distillation	CodeCode Available	1
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation	Jun 18, 2020	DecoderKnowledge Distillation	CodeCode Available	1
AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection	May 21, 2024	Knowledge DistillationPedestrian Detection	CodeCode Available	1
Structured Sparse R-CNN for Direct Scene Graph Generation	Jun 21, 2021	graph constructionGraph Generation	CodeCode Available	1
Deep Graph-level Anomaly Detection by Glocal Knowledge Distillation	Dec 19, 2021	Anomaly DetectionKnowledge Distillation	CodeCode Available	1
DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer	May 21, 2025	DenoisingKnowledge Distillation	CodeCode Available	1
Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction	Sep 1, 2021	Data PoisoningKnowledge Distillation	CodeCode Available	1
SUOD: Toward Scalable Unsupervised Outlier Detection	Feb 8, 2020	Knowledge DistillationOutlier Detection	CodeCode Available	1
Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning	Jun 11, 2023	Knowledge DistillationMeta-Learning	CodeCode Available	1
Supervised Compression for Resource-Constrained Edge Computing Systems	Aug 21, 2021	Data CompressionEdge-computing	CodeCode Available	1
Black-box Few-shot Knowledge Distillation	Jul 25, 2022	image-classificationImage Classification	CodeCode Available	1
Distill on the Go: Online knowledge distillation in self-supervised learning	Apr 20, 2021	Knowledge DistillationSelf-Supervised Learning	CodeCode Available	1
DIOD: Self-Distillation Meets Object Discovery	Jan 1, 2024	Instance SegmentationKnowledge Distillation	CodeCode Available	1
Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation	May 16, 2023	Knowledge Distillationtext-classification	CodeCode Available	1
Distilling a Powerful Student Model via Online Knowledge Distillation	Mar 26, 2021	Knowledge Distillation	CodeCode Available	1
Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation	Jul 21, 2020	Instance SegmentationKnowledge Distillation	CodeCode Available	1
Teachers Do More Than Teach: Compressing Image-to-Image Models	Mar 5, 2021	Knowledge Distillation	CodeCode Available	1
Deep Structured Instance Graph for Distilling Object Detectors	Sep 27, 2021	Instance SegmentationKnowledge Distillation	CodeCode Available	1
Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation	Mar 21, 2022	Document-level Relation ExtractionKnowledge Distillation	CodeCode Available	1
Distill the Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation	Oct 10, 2022	Knowledge DistillationMachine Translation	CodeCode Available	1
Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation	Jun 1, 2020	Knowledge DistillationNeural Architecture Search	CodeCode Available	1
Temporal Self-Ensembling Teacher for Semi-Supervised Object Detection	Jul 13, 2020	image-classificationImage Classification	CodeCode Available	1
Defocus Blur Detection via Depth Distillation	Jul 16, 2020	DecoderDefocus Blur Detection	CodeCode Available	1
Deformation Flow Based Two-Stream Network for Lip Reading	Mar 12, 2020	Knowledge DistillationLipreading	CodeCode Available	1
The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation	Jun 13, 2022	Knowledge DistillationTransfer Learning	CodeCode Available	1
Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation	Sep 16, 2022	Domain AdaptationImage-to-Image Translation	CodeCode Available	1
Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs	May 21, 2025	Knowledge DistillationKnowledge Graphs	CodeCode Available	1
DistilPose: Tokenized Pose Regression with Heatmap Distillation	Mar 4, 2023	Knowledge DistillationPose Estimation	CodeCode Available	1
DPM-OT: A New Diffusion Probabilistic Model Based on Optimal Transport	Jul 21, 2023	DenoisingKnowledge Distillation	CodeCode Available	1
EchoDFKD: Data-Free Knowledge Distillation for Cardiac Ultrasound Segmentation using Synthetic Data	Sep 11, 2024	Data-free Knowledge DistillationKnowledge Distillation	CodeCode Available	1
End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation	Apr 1, 2022	Human-Object Interaction DetectionKnowledge Distillation	CodeCode Available	1
FerKD: Surgical Label Adaptation for Efficient Distillation	Dec 29, 2023	Knowledge Distillation	CodeCode Available	1
Dense Interspecies Face Embedding	Nov 28, 2022	Image ManipulationInterspecies Facial Keypoint Transfer	CodeCode Available	1
Instance-Conditional Knowledge Distillation for Object Detection	Oct 25, 2021	Image ClassificationKnowledge Distillation	CodeCode Available	1

Show:10 25 50

← PrevPage 19 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified