Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 101–150 of 4240 papers

Title	Date	Tasks	Status	Hype
Accessing Vision Foundation Models at ImageNet-level Costs	Jul 15, 2024	Knowledge DistillationTransfer Learning	CodeCode Available	2
Scaled Decoupled Distillation	Jan 1, 2024	Knowledge Distillation	CodeCode Available	2
Efficient Multivariate Time Series Forecasting via Calibrated Language Models with Privileged Knowledge Distillation	May 4, 2025	Knowledge DistillationMultivariate Time Series Forecasting	CodeCode Available	2
Semi-Supervised Domain Generalizable Person Re-Identification	Aug 11, 2021	Generalizable Person Re-identificationKnowledge Distillation	CodeCode Available	2
Sinkhorn Distance Minimization for Knowledge Distillation	Feb 27, 2024	DecoderKnowledge Distillation	CodeCode Available	2
Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study	Jun 20, 2024	In-Context LearningKnowledge Distillation	CodeCode Available	2
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement	Apr 10, 2025	Knowledge DistillationVisual Reasoning	CodeCode Available	2
SSDA-YOLO: Semi-supervised Domain Adaptive YOLO for Cross-Domain Object Detection	Nov 4, 2022	Domain AdaptationKnowledge Distillation	CodeCode Available	2
CALF: Aligning LLMs for Time Series Forecasting via Cross-modal Fine-Tuning	Mar 12, 2024	Knowledge DistillationMultivariate Time Series Forecasting	CodeCode Available	2
Rethinking Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising	Apr 11, 2024	Computational EfficiencyDenoising	CodeCode Available	2
Event Stream-based Visual Object Tracking: A High-Resolution Benchmark Dataset and A Novel Baseline	Sep 26, 2023	Knowledge DistillationObject Tracking	CodeCode Available	2
Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution	Apr 15, 2025	Image Super-ResolutionKnowledge Distillation	CodeCode Available	2
DOT: A Distillation-Oriented Trainer	Jul 17, 2023	Knowledge Distillation	CodeCode Available	2
VkD: Improving Knowledge Distillation using Orthogonal Projections	Jan 1, 2024	Image GenerationKnowledge Distillation	CodeCode Available	2
Decoupled Knowledge Distillation	Mar 16, 2022	image-classificationImage Classification	CodeCode Available	2
Scalable Zero-shot Entity Linking with Dense Entity Retrieval	Nov 10, 2019	Entity EmbeddingsEntity Linking	CodeCode Available	2
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation	Jul 3, 2024	Domain GeneralizationKnowledge Distillation	CodeCode Available	2
Improving the Training of Rectified Flows	May 30, 2024	Image GenerationKnowledge Distillation	CodeCode Available	2
ConDistFL: Conditional Distillation for Federated Learning from Partially Annotated Data	Aug 8, 2023	Federated LearningKnowledge Distillation	CodeCode Available	2
Diffusion Time-step Curriculum for One Image to 3D Generation	Apr 6, 2024	3D GenerationImage to 3D	CodeCode Available	2
Are Large Kernels Better Teachers than Transformers for ConvNets?	May 30, 2023	Knowledge Distillation	CodeCode Available	2
Dual-Space Knowledge Distillation for Large Language Models	Jun 25, 2024	Instruction FollowingKnowledge Distillation	CodeCode Available	2
Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference	Dec 15, 2023	DecoderDenoising	CodeCode Available	2
Let Images Give You More:Point Cloud Cross-Modal Training for Shape Analysis	Oct 9, 2022	3D Point Cloud ClassificationKnowledge Distillation	CodeCode Available	2
Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies	Jan 4, 2025	Edge-computingKnowledge Distillation	CodeCode Available	2
Anomaly Detection via Reverse Distillation from One-Class Embedding	Jan 26, 2022	Anomaly Classification	CodeCode Available	2
2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds	Jul 10, 2022	3D Semantic SegmentationAutonomous Driving	CodeCode Available	2
OBSeg: Accurate and Fast Instance Segmentation Framework Using Segmentation Foundation Models with Oriented Bounding Box Prompts	Jan 16, 2024	Amodal Instance SegmentationInstance Segmentation	CodeCode Available	2
Cross-Image Relational Knowledge Distillation for Semantic Segmentation	Apr 14, 2022	Knowledge DistillationSegmentation	CodeCode Available	2
A Deep Knowledge Distillation framework for EEG assisted enhancement of single-lead ECG based sleep staging	Dec 14, 2021	ECG based Sleep StagingEEG	CodeCode Available	1
Collaborative Distillation for Ultra-Resolution Universal Style Transfer	Mar 18, 2020	DecoderGPU	CodeCode Available	1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks	Mar 25, 2022	Incremental LearningKnowledge Distillation	CodeCode Available	1
Coaching a Teachable Student	Jun 16, 2023	CARLA longest6Knowledge Distillation	CodeCode Available	1
COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers	Sep 3, 2023	Action DetectionAction Spotting	CodeCode Available	1
CLRKDNet: Speeding up Lane Detection with Knowledge Distillation	May 21, 2024	Autonomous DrivingKnowledge Distillation	CodeCode Available	1
Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning	Aug 18, 2023	class-incremental learningClass Incremental Learning	CodeCode Available	1
CMDFusion: Bidirectional Fusion Network with Cross-modality Knowledge Distillation for LIDAR Semantic Segmentation	Jul 9, 2023	Autonomous VehiclesKnowledge Distillation	CodeCode Available	1
Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval	Jul 31, 2022	Knowledge DistillationLanguage Modeling	CodeCode Available	1
Cloud Object Detector Adaptation by Integrating Different Source Knowledge	Dec 10, 2024	Domain AdaptationKnowledge Distillation	CodeCode Available	1
CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation	Aug 26, 2022	3D Action RecognitionAction Recognition	CodeCode Available	1
Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation	Jan 25, 2024	ClusteringFederated Learning	CodeCode Available	1
CLIP-KD: An Empirical Study of CLIP Model Distillation	Jul 24, 2023	Contrastive LearningCross-Modal Retrieval	CodeCode Available	1
CLIP-guided Federated Learning on Heterogeneous and Long-Tailed Data	Dec 14, 2023	Contrastive LearningFederated Learning	CodeCode Available	1
CLIP model is an Efficient Continual Learner	Oct 6, 2022	Continual LearningIncremental Learning	CodeCode Available	1
Understanding the Role of the Projector in Knowledge Distillation	Mar 20, 2023	image-classificationImage Classification	CodeCode Available	1
Adaptive Multi-Teacher Multi-level Knowledge Distillation	Mar 6, 2021	Knowledge Distillation	CodeCode Available	1
CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers	Apr 9, 2024	Knowledge DistillationZero-shot Generalization	CodeCode Available	1
CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning	May 30, 2025	class-incremental learningClass Incremental Learning	CodeCode Available	1
Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning	Jun 11, 2023	Knowledge DistillationMeta-Learning	CodeCode Available	1
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation	Apr 2, 2022	class-incremental learningClass Incremental Learning	CodeCode Available	1

Show:10 25 50

← PrevPage 3 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified