Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
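As a rough illustration of how such a transfer is often set up, the sketch below computes a combined training loss for one example. It assumes the soft-target formulation commonly attributed to Hinton et al. (2015), which this page does not itself specify. The function names and the `temperature` and `alpha` parameters are illustrative choices, not taken from any paper listed here.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    Assumed formulation (not from the source page):
      alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE(student, label)
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL divergence between the softened teacher and student distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # Ordinary cross-entropy against the ground-truth label at temperature 1.
    ce = -log_softmax(student_logits)[label]

    # The T^2 factor keeps the soft-target gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


if __name__ == "__main__":
    teacher = [6.0, 2.0, -1.0]   # confident large model
    student = [2.0, 1.5, 0.5]    # less certain small model
    print(distillation_loss(student, teacher, label=0))
```

A higher temperature flattens both distributions, so the student also learns from the teacher's relative scores on the wrong classes. `alpha` trades that signal off against the ground-truth label.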

Papers

Showing 51–100 of 4240 papers

Title	Date	Tasks	Status	Hype
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition	Jan 19, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models	Jul 7, 2024	class-incremental learning, Class Incremental Learning	Code Available	2
MobileFaceSwap: A Lightweight Framework for Video Face Swapping	Jan 11, 2022	Face Swapping, Knowledge Distillation	Code Available	2
Learning an Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking	Dec 28, 2024	Knowledge Distillation, Visual Tracking	Code Available	2
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization	Mar 11, 2025	GPU, Image Generation	Code Available	2
OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion	Feb 27, 2023	3D geometry, 3D Semantic Scene Completion	Code Available	2
BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions	Dec 16, 2024	Knowledge Distillation, Motion Estimation	Code Available	2
On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving	Mar 2, 2024	Autonomous Driving, Knowledge Distillation	Code Available	2
A Cognitive-Based Trajectory Prediction Approach for Autonomous Driving	Feb 29, 2024	Autonomous Driving, Decision Making	Code Available	2
Knowledge distillation: A good teacher is patient and consistent	Jun 9, 2021	Image Classification, Knowledge Distillation	Code Available	2
Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners	Apr 2, 2024	class-incremental learning, Class Incremental Learning	Code Available	2
Incremental Sequence Labeling: A Tale of Two Shifts	Feb 16, 2024	Incremental Learning, Knowledge Distillation	Code Available	2
Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks	Apr 13, 2020	Knowledge Distillation, Model Compression	Code Available	2
From Instance Training to Instruction Learning: Task Adapters Generation from Instructions	Jun 18, 2024	Knowledge Distillation	Code Available	2
Focal Loss for Dense Object Detection	Aug 7, 2017	2D Object Detection, Dense Object Detection	Code Available	2
Improving the Training of Rectified Flows	May 30, 2024	Image Generation, Knowledge Distillation	Code Available	2
Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection	Mar 14, 2024	Knowledge Distillation, Novel Object Detection	Code Available	2
LightGNN: Simple Graph Neural Network for Recommendation	Jan 6, 2025	Computational Efficiency, Graph Neural Network	Code Available	2
Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review	Oct 4, 2024	Knowledge Distillation, Logical Reasoning	Code Available	2
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation	Nov 9, 2022	Audio Classification, Audio Tagging	Code Available	2
EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization	Sep 20, 2023	Knowledge Distillation, object-detection	Code Available	2
Dual-Space Knowledge Distillation for Large Language Models	Jun 25, 2024	Instruction Following, Knowledge Distillation	Code Available	2
DOT: A Distillation-Oriented Trainer	Jul 17, 2023	Knowledge Distillation	Code Available	2
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models	Oct 24, 2023	Audio Classification, Audio Tagging	Code Available	2
ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning	Mar 29, 2024	Continual Learning, Continual Panoptic Segmentation	Code Available	2
ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Medical Image Segmentation	Jul 19, 2024	Decoder, Image Segmentation	Code Available	2
Efficient Multivariate Time Series Forecasting via Calibrated Language Models with Privileged Knowledge Distillation	May 4, 2025	Knowledge Distillation, Multivariate Time Series Forecasting	Code Available	2
Decoupled Knowledge Distillation	Mar 16, 2022	image-classification, Image Classification	Code Available	2
Event Stream-based Visual Object Tracking: A High-Resolution Benchmark Dataset and A Novel Baseline	Sep 26, 2023	Knowledge Distillation, Object Tracking	Code Available	2
Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference	Dec 15, 2023	Decoder, Denoising	Code Available	2
A Comprehensive Survey on Knowledge Distillation	Mar 15, 2025	Knowledge Distillation, Survey	Code Available	2
Diffusion Time-step Curriculum for One Image to 3D Generation	Apr 6, 2024	3D Generation, Image to 3D	Code Available	2
Anomaly Detection via Reverse Distillation from One-Class Embedding	Jan 26, 2022	Anomaly Classification	Code Available	2
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation	Jul 3, 2024	Domain Generalization, Knowledge Distillation	Code Available	2
Are Large Kernels Better Teachers than Transformers for ConvNets?	May 30, 2023	Knowledge Distillation	Code Available	2
JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework	Feb 19, 2025	Change Detection, Earth Observation	Code Available	2
Cross-Image Relational Knowledge Distillation for Semantic Segmentation	Apr 14, 2022	Knowledge Distillation, Segmentation	Code Available	2
MiniLLM: Knowledge Distillation of Large Language Models	Jun 14, 2023	Instruction Following, Knowledge Distillation	Code Available	2
ConDistFL: Conditional Distillation for Federated Learning from Partially Annotated Data	Aug 8, 2023	Federated Learning, Knowledge Distillation	Code Available	2
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future	Jul 18, 2023	Knowledge Distillation, object-detection	Code Available	2
A Unified Framework for 3D Scene Understanding	Jul 3, 2024	Contrastive Learning, Knowledge Distillation	Code Available	2
Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking	Apr 12, 2025	Knowledge Distillation	Code Available	2
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis	Oct 9, 2022	3D Point Cloud Classification, Knowledge Distillation	Code Available	2
LibFewShot: A Comprehensive Library for Few-shot Learning	Sep 10, 2021	Data Augmentation, Few-Shot Image Classification	Code Available	2
Data-Free Knowledge Distillation for Deep Neural Networks	Oct 19, 2017	Data-free Knowledge Distillation, Knowledge Distillation	Code Available	2
Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution	Oct 5, 2024	Image Super-Resolution, Knowledge Distillation	Code Available	2
CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition	May 24, 2023	Denoising, Knowledge Distillation	Code Available	2
Positive-Unlabeled Compression on the Cloud	Sep 21, 2019	GPU, Knowledge Distillation	Code Available	2
Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study	Jun 20, 2024	In-Context Learning, Knowledge Distillation	Code Available	2
2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds	Jul 10, 2022	3D Semantic Segmentation, Autonomous Driving	Code Available	2

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified