
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have more representational capacity than small models, that capacity is often not fully utilized, so a compact student can frequently recover most of the teacher's accuracy at a fraction of the inference cost. The standard recipe trains the student to match the teacher's temperature-softened output distribution in addition to the ground-truth labels.
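
A minimal sketch of that soft-target formulation (Hinton et al., 2015), written here in PyTorch; the temperature and alpha values are illustrative defaults, not canonical settings:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Soft-target distillation loss: KL to the teacher plus hard-label CE."""
    # Soften both output distributions with a temperature, then match
    # the student to the teacher with KL divergence.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean")
    kd = kd * temperature ** 2  # rescale so gradients match the hard loss
    ce = F.cross_entropy(student_logits, labels)  # standard hard-label loss
    return alpha * kd + (1.0 - alpha) * ce

# Example: one batch of 10-class logits from a (hypothetical) teacher/student.
student_logits = torch.randn(32, 10, requires_grad=True)
teacher_logits = torch.randn(32, 10)
labels = torch.randint(0, 10, (32,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```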

Papers

Showing 3901–3950 of 4240 papers

| Title | Status | Hype |
|---|---|---|
| G^2D: Boosting Multimodal Learning with Gradient-Guided Distillation | Code | 0 |
| Revisiting Knowledge Distillation for Autoregressive Language Models | Code | 0 |
| Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation | Code | 0 |
| Revisiting Knowledge Distillation under Distribution Shift | Code | 0 |
| Distillation-based fabric anomaly detection | Code | 0 |
| Multiple Teachers-Meticulous Student: A Domain Adaptive Meta-Knowledge Distillation Model for Medical Image Classification | Code | 0 |
| F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models | Code | 0 |
| Knowledge-guided Causal Intervention for Weakly-supervised Object Localization | Code | 0 |
| Structured Knowledge Distillation for Dense Prediction | Code | 0 |
| Distill2Vec: Dynamic Graph Representation Learning with Knowledge Distillation | Code | 0 |
| Multi-source-free Domain Adaptation via Uncertainty-aware Adaptive Distillation | Code | 0 |
| Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation | Code | 0 |
| Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation | Code | 0 |
| Structured Knowledge Distillation for Semantic Segmentation | Code | 0 |
| Multi-stage Distillation Framework for Cross-Lingual Semantic Similarity Matching | Code | 0 |
| Revisiting Knowledge Distillation via Label Smoothing Regularization | Code | 0 |
| Towards Class-wise Fair Adversarial Training via Anti-Bias Soft Label Distillation | Code | 0 |
| WaterMono: Teacher-Guided Anomaly Masking and Enhancement Boosting for Robust Underwater Self-Supervised Monocular Depth Estimation | Code | 0 |
| Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video | Code | 0 |
| FS-BAN: Born-Again Networks for Domain Generalization Few-Shot Classification | Code | 0 |
| From underwater to aerial: a novel multi-scale knowledge distillation approach for coral reef monitoring | Code | 0 |
| Preference-Consistent Knowledge Distillation for Recommender System | Code | 0 |
| CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective | Code | 0 |
| Multi-Teacher Knowledge Distillation For Text Image Machine Translation | Code | 0 |
| Multi Teacher Privileged Knowledge Distillation for Multimodal Expression Recognition | Code | 0 |
| Multi-to-Single Knowledge Distillation for Point Cloud Semantic Segmentation | Code | 0 |
| Right Time to Learn: Promoting Generalization via Bio-inspired Spacing Effect in Knowledge Distillation | Code | 0 |
| Chemical transformer compression for accelerating both training and inference of molecular modeling | Code | 0 |
| Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning | Code | 0 |
| An Efficient End-to-End Approach to Noise Invariant Speech Features via Multi-Task Learning | Code | 0 |
| Frameless Graph Knowledge Distillation | Code | 0 |
| Discourse Structures Guided Fine-grained Propaganda Identification | Code | 0 |
| MiniDisc: Minimal Distillation Schedule for Language Model Compression | Code | 0 |
| FractalAD: A simple industrial anomaly detection method using fractal anomaly generation and backbone knowledge distillation | Code | 0 |
| Student Becomes Decathlon Master in Retinal Vessel Segmentation via Dual-teacher Multi-target Domain Adaptation | Code | 0 |
| Robust and Accurate Object Detection via Self-Knowledge Distillation | Code | 0 |
| Multi-View 3D Reconstruction using Knowledge Distillation | Code | 0 |
| Exploring Generalizable Distillation for Efficient Medical Image Segmentation | Code | 0 |
| DiSCo: LLM Knowledge Distillation for Efficient Sparse Retrieval in Conversational Search | Code | 0 |
| Digital Staining with Knowledge Distillation: A Unified Framework for Unpaired and Paired-But-Misaligned Data | Code | 0 |
| Analyzing the Confidentiality of Undistillable Teachers in Knowledge Distillation | Code | 0 |
| Mutual-Learning Knowledge Distillation for Nighttime UAV Tracking | Code | 0 |
| A Unified Object Counting Network with Object Occupation Prior | Code | 0 |
| Student Helping Teacher: Teacher Evolution via Self-Knowledge Distillation | Code | 0 |
| Towards Data-Free Domain Generalization | Code | 0 |
| MV-MR: multi-views and multi-representations for self-supervised learning and knowledge distillation | Code | 0 |
| Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model | Code | 0 |
| Accelerated Proton Resonance Frequency-based Magnetic Resonance Thermometry by Optimized Deep Learning Method | Code | 0 |
| Foundation Models for Structural Health Monitoring | Code | 0 |
| Natural Language Generation for Effective Knowledge Distillation | Code | 0 |
Page 79 of 85

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified |
| 5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified |
| 6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | | Unverified |