Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.
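The most common recipe, due to Hinton et al. (2015), trains the student to match the teacher's temperature-softened output distribution in addition to the ground-truth labels. Below is a minimal PyTorch sketch of that soft-target loss; the toy teacher/student networks and the values of T and alpha are illustrative assumptions, not taken from any paper listed on this page.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical teacher/student pair: any two classifiers whose output
# dimensions match will do. The teacher is larger; the student is compact.
teacher = nn.Sequential(nn.Flatten(), nn.Linear(784, 1024), nn.ReLU(), nn.Linear(1024, 10))
student = nn.Sequential(nn.Flatten(), nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target KD loss (Hinton et al., 2015).

    KL divergence between temperature-softened teacher and student
    distributions, mixed with ordinary cross-entropy on hard labels.
    The T**2 factor keeps soft-target gradients on a comparable scale
    as T varies.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.log_softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
        log_target=True,
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# One illustrative training step on random data.
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
with torch.no_grad():
    t_logits = teacher(x)  # the teacher is frozen during distillation
s_logits = student(x)
loss = distillation_loss(s_logits, t_logits, y)
loss.backward()
```

Many of the papers below replace or augment this logit-matching objective (feature distillation, relational distillation, alternative divergences), but the teacher-to-student transfer structure is the same.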

Papers

Showing 101–150 of 4240 papers

Title | Status | Hype
MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices | Code | 2
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation | Code | 2
SSDA-YOLO: Semi-supervised Domain Adaptive YOLO for Cross-Domain Object Detection | Code | 2
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform | Code | 2
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis | Code | 2
On-Device Domain Generalization | Code | 2
2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds | Code | 2
MetaFed: Federated Learning among Federations with Cyclic Knowledge Distillation for Personalized Healthcare | Code | 2
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers | Code | 2
Masked Generative Distillation | Code | 2
Cross-Image Relational Knowledge Distillation for Semantic Segmentation | Code | 2
Localization Distillation for Object Detection | Code | 2
Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results | Code | 2
Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation | Code | 2
Decoupled Knowledge Distillation | Code | 2
Tiny Object Tracking: A Large-scale Dataset and A Baseline | Code | 2
Anomaly Detection via Reverse Distillation from One-Class Embedding | Code | 2
MobileFaceSwap: A Lightweight Framework for Video Face Swapping | Code | 2
LibFewShot: A Comprehensive Library for Few-shot Learning | Code | 2
Semi-Supervised Domain Generalizable Person Re-Identification | Code | 2
Learning Student Networks in the Wild | Code | 2
Knowledge distillation: A good teacher is patient and consistent | Code | 2
Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks | Code | 2
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing | Code | 2
Scalable Zero-shot Entity Linking with Dense Entity Retrieval | Code | 2
Positive-Unlabeled Compression on the Cloud | Code | 2
Well-Read Students Learn Better: On the Importance of Pre-training Compact Models | Code | 2
Data-Free Knowledge Distillation for Deep Neural Networks | Code | 2
Focal Loss for Dense Object Detection | Code | 2
SeqPE: Transformer with Sequential Position Encoding | Code | 1
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning | Code | 1
CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning | Code | 1
UWSAM: Segment Anything Model Guided Underwater Instance Segmentation and A Large-scale Benchmark Dataset | Code | 1
Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs | Code | 1
DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer | Code | 1
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone | Code | 1
Always Clear Depth: Robust Monocular Depth Estimation under Adverse Weather | Code | 1
Foundation Models Knowledge Distillation For Battery Capacity Degradation Forecast | Code | 1
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence | Code | 1
Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification | Code | 1
Teach Me How to Denoise: A Universal Framework for Denoising Multi-modal Recommender Systems via Guided Calibration | Code | 1
A Dual-Space Framework for General Knowledge Distillation of Large Language Models | Code | 1
Better Estimation of the KL Divergence Between Language Models | Code | 1
Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking | Code | 1
Multi-modal Knowledge Distillation-based Human Trajectory Forecasting | Code | 1
High Temporal Consistency through Semantic Similarity Propagation in Semi-Supervised Video Semantic Segmentation for Autonomous Flight | Code | 1
Exploring Performance-Complexity Trade-Offs in Sound Event Detection Models | Code | 1
Spatial Distillation based Distribution Alignment (SDDA) for Cross-Headset EEG Classification | Code | 1
Semantic Shift Estimation via Dual-Projection and Classifier Reconstruction for Exemplar-Free Class-Incremental Learning | Code | 1
Advantage-Guided Distillation for Preference Alignment in Small Language Models | Code | 1

Benchmark Results

In each Model column below, T: denotes the teacher network and S: the student. The Verified column is empty for entries whose status is Unverified.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | – | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | – | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | – | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | – | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | – | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | – | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | – | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | – | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | – | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | – | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | – | Unverified