
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. Distillation exploits this gap: a compact student is trained to mimic the outputs of the larger teacher, retaining much of the teacher's accuracy at a far lower inference cost.
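
Most of the papers indexed below build on, or compare against, the classic logit-matching formulation of Hinton et al. (2015). The following is a minimal sketch of that baseline in PyTorch; the temperature T, mixing weight alpha, and the toy shapes in the demo are illustrative choices, not values taken from any paper listed here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classic softened-softmax knowledge distillation loss.

    Blends a KL term between temperature-scaled teacher and student
    distributions with ordinary cross-entropy on the hard labels.
    T and alpha are illustrative defaults, not tuned values.
    """
    # KL divergence between softened distributions; the T**2 factor keeps
    # gradient magnitudes comparable across different temperatures.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

if __name__ == "__main__":
    # Toy demo: random logits stand in for real teacher/student outputs.
    student_logits = torch.randn(8, 100)            # batch of 8, 100 classes
    teacher_logits = torch.randn(8, 100).detach()   # teacher is frozen
    labels = torch.randint(0, 100, (8,))
    print(distillation_loss(student_logits, teacher_logits, labels).item())
```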

Papers

Showing 101–150 of 4240 papers

Title | Status | Hype
Accessing Vision Foundation Models at ImageNet-level Costs | Code | 2
Scale Decoupled Distillation | Code | 2
Focal Loss for Dense Object Detection | Code | 2
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Code | 2
Sinkhorn Distance Minimization for Knowledge Distillation | Code | 2
Social4Rec: Distilling User Preference from Social Graph for Video Recommendation in Tencent | Code | 2
SSDA-YOLO: Semi-supervised Domain Adaptive YOLO for Cross-Domain Object Detection | Code | 2
Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs | Code | 2
Rethinking Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising | Code | 2
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing | Code | 2
Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution | Code | 2
Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution | Code | 2
VkD: Improving Knowledge Distillation using Orthogonal Projections | Code | 2
Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation | Code | 2
DOT: A Distillation-Oriented Trainer | Code | 2
Scalable Zero-shot Entity Linking with Dense Entity Retrieval | Code | 2
Knowledge distillation: A good teacher is patient and consistent | Code | 2
JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework | Code | 2
Data-Free Knowledge Distillation for Deep Neural Networks | Code | 2
2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds | Code | 2
Event Stream-based Visual Object Tracking: HDETrack V2 and A High-Definition Benchmark | Code | 2
ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Medical Image Segmentation | Code | 2
ConDistFL: Conditional Distillation for Federated Learning from Partially Annotated Data | Code | 2
Decoupled Knowledge Distillation | Code | 2
Dual-Space Knowledge Distillation for Large Language Models | Code | 2
From Instance Training to Instruction Learning: Task Adapters Generation from Instructions | Code | 2
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS | Code | 2
OBSeg: Accurate and Fast Instance Segmentation Framework Using Segmentation Foundation Models with Oriented Bounding Box Prompts | Code | 2
Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss | Code | 2
A Deep Knowledge Distillation framework for EEG assisted enhancement of single-lead ECG based sleep staging | Code | 1
Collaborative Distillation for Ultra-Resolution Universal Style Transfer | Code | 1
AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks | Code | 1
Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks | Code | 1
Coaching a Teachable Student | Code | 1
COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers | Code | 1
CMDFusion: Bidirectional Fusion Network with Cross-modality Knowledge Distillation for LIDAR Semantic Segmentation | Code | 1
Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning | Code | 1
CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation | Code | 1
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? | Code | 1
CLRKDNet: Speeding up Lane Detection with Knowledge Distillation | Code | 1
Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation | Code | 1
CLIP model is an Efficient Continual Learner | Code | 1
Augmentation-Free Dense Contrastive Knowledge Distillation for Efficient Semantic Segmentation | Code | 1
CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning | Code | 1
Understanding the Role of the Projector in Knowledge Distillation | Code | 1
Adaptive Multi-Teacher Multi-level Knowledge Distillation | Code | 1
FocusNet: Classifying Better by Focusing on Confusing Classes | Code | 1
Adaptive Multi-Teacher Knowledge Distillation with Meta-Learning | Code | 1
Audio Embeddings as Teachers for Music Classification | Code | 1
Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | Code | 1

Benchmark Results

In the tables below, "T:" denotes the teacher model and "S:" the student model. The Verified column is blank because none of these claimed results has been verified yet.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: AdaBins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified