SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. Distillation therefore trains a compact student model to reproduce the behavior of the large teacher, typically by matching its output distribution, so that much of the teacher's accuracy is retained at a fraction of the inference cost.
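
As a concrete illustration, the sketch below shows the classic soft-label distillation objective of Hinton et al. (2015): a KL-divergence term between temperature-softened teacher and student predictions, blended with the usual cross-entropy on ground-truth labels. It assumes PyTorch; the function name and the `temperature` and `alpha` defaults are illustrative choices, not settings taken from any paper listed below.

```python
# Minimal sketch of soft-label knowledge distillation, assuming PyTorch.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend cross-entropy on hard labels with a KL term that pushes the
    student's softened distribution toward the teacher's."""
    # Soften both distributions with the same temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence between softened teacher and student; the T^2 factor keeps
    # the soft-target gradients on a scale comparable to the hard-label term.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2

    # Standard supervised loss on the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term

# Usage with random tensors standing in for a batch of 8 examples, 10 classes.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```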

Papers

Showing 1–50 of 4240 papers

| Title | Status | Hype |
| --- | --- | --- |
| O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? | Code | 7 |
| AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time | Code | 5 |
| A Survey on Knowledge Distillation of Large Language Models | Code | 5 |
| MobileSAMv2: Faster Segment Anything to Everything | Code | 5 |
| Awesome Multi-modal Object Tracking | Code | 5 |
| Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs | Code | 4 |
| Vision-Language Models for Vision Tasks: A Survey | Code | 4 |
| SAMPart3D: Segment Any Part in 3D Objects | Code | 4 |
| Effective Whole-body Pose Estimation with Two-stages Distillation | Code | 4 |
| Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling | Code | 4 |
| LLM Inference Unveiled: Survey and Roofline Model Insights | Code | 4 |
| BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation | Code | 4 |
| Generalized Robot 3D Vision-Language Model with Fast Rendering and Pre-Training Vision-Language Alignment | Code | 3 |
| Logit Standardization in Knowledge Distillation | Code | 3 |
| Efficient Reasoning Models: A Survey | Code | 3 |
| Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation | Code | 3 |
| PromptKD: Unsupervised Prompt Distillation for Vision-Language Models | Code | 3 |
| DistiLLM: Towards Streamlined Distillation for Large Language Models | Code | 3 |
| Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Code | 3 |
| Compact Language Models via Pruning and Knowledge Distillation | Code | 3 |
| CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification | Code | 3 |
| Recurrent Drafter for Fast Speculative Decoding in Large Language Models | Code | 3 |
| ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech | Code | 3 |
| N-LTP: An Open-source Neural Language Technology Platform for Chinese | Code | 3 |
| LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Code | 3 |
| A Survey on Inference Optimization Techniques for Mixture of Experts Models | Code | 3 |
| Semi-Supervised Speech Recognition via Local Prior Matching | Code | 3 |
| Focal Loss for Dense Object Detection | Code | 2 |
| Anomaly Detection via Reverse Distillation from One-Class Embedding | Code | 2 |
| From Instance Training to Instruction Learning: Task Adapters Generation from Instructions | Code | 2 |
| Event Stream-based Visual Object Tracking: A High-Resolution Benchmark Dataset and A Novel Baseline | Code | 2 |
| Event Stream-based Visual Object Tracking: HDETrack V2 and A High-Definition Benchmark | Code | 2 |
| Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference | Code | 2 |
| Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review | Code | 2 |
| EPTQ: Enhanced Post-Training Quantization via Hessian-guided Network-wise Optimization | Code | 2 |
| Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation | Code | 2 |
| ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Code | 2 |
| Efficient Multivariate Time Series Forecasting via Calibrated Language Models with Privileged Knowledge Distillation | Code | 2 |
| ESP-MedSAM: Efficient Self-Prompting SAM for Universal Domain-Generalized Medical Image Segmentation | Code | 2 |
| Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution | Code | 2 |
| Diffusion Time-step Curriculum for One Image to 3D Generation | Code | 2 |
| Decoupled Knowledge Distillation | Code | 2 |
| Distillation-Free One-Step Diffusion for Real-World Image Super-Resolution | Code | 2 |
| DOT: A Distillation-Oriented Trainer | Code | 2 |
| ConDistFL: Conditional Distillation for Federated Learning from Partially Annotated Data | Code | 2 |
| OBSeg: Accurate and Fast Instance Segmentation Framework Using Segmentation Foundation Models with Oriented Bounding Box Prompts | Code | 2 |
| Cross-Image Relational Knowledge Distillation for Semantic Segmentation | Code | 2 |
| CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition | Code | 2 |
| Data-Free Knowledge Distillation for Deep Neural Networks | Code | 2 |
| Dual-Space Knowledge Distillation for Large Language Models | Code | 2 |

Benchmark Results

In the model names below, "T:" denotes the teacher model and "S:" the student.

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified |
| 2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified |
| 3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified |
| 4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified |
| 5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified |
| 6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified |
| 7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified |
| 8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified |
| 9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified |
| 10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | | Unverified |
| 2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | | Unverified |
| 3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 Accuracy (%) | 78.6 | | Unverified |
| 4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | | Unverified |
| 5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | | Unverified |
| 6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | | Unverified |
| 7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | | Unverified |
| 8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | | Unverified |
| 9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | | Unverified |
| 10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified |
| 2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | | Unverified |