Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 551–600 of 4240 papers

Title	Date	Tasks	Status	Hype
Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation	Oct 12, 2022	Class-Incremental Semantic SegmentationKnowledge Distillation	CodeCode Available	1
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents	Oct 13, 2023	InformativenessKnowledge Distillation	CodeCode Available	1
Categorical Relation-Preserving Contrastive Knowledge Distillation for Medical Image Classification	Jul 7, 2021	Classificationimage-classification	CodeCode Available	1
Dice Semimetric Losses: Optimizing the Dice Score with Soft Labels	Mar 28, 2023	Knowledge Distillation	CodeCode Available	1
Anti-Distillation Backdoor Attacks: Backdoors Can Really Survive in Knowledge Distillation	Oct 24, 2021	Backdoor AttackKnowledge Distillation	CodeCode Available	1
DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation	Apr 5, 2023	Data AugmentationKnowledge Distillation	CodeCode Available	1
Extending global-local view alignment for self-supervised learning with remote sensing imagery	Mar 12, 2023	Change DetectionContrastive Learning	CodeCode Available	1
DIOD: Self-Distillation Meets Object Discovery	Jan 1, 2024	Instance SegmentationKnowledge Distillation	CodeCode Available	1
DA-Mamba: Domain Adaptive Hybrid Mamba-Transformer Based One-Stage Object Detection	Feb 16, 2025	Domain AdaptationKnowledge Distillation	CodeCode Available	1
CCL: Continual Contrastive Learning for LiDAR Place Recognition	Mar 24, 2023	Autonomous DrivingContinual Learning	CodeCode Available	1
CLIP-KD: An Empirical Study of CLIP Model Distillation	Jul 24, 2023	Contrastive LearningCross-Modal Retrieval	CodeCode Available	1
CLIP model is an Efficient Continual Learner	Oct 6, 2022	Continual LearningIncremental Learning	CodeCode Available	1
Class-relation Knowledge Distillation for Novel Class Discovery	Jul 18, 2023	Knowledge DistillationNovel Class Discovery	CodeCode Available	1
Distilling Knowledge from Graph Convolutional Networks	Mar 23, 2020	Knowledge DistillationTransfer Learning	CodeCode Available	1
CEKD: Cross-Modal Edge-Privileged Knowledge Distillation for Semantic Scene Understanding Using Only Thermal Images	Feb 22, 2023	Knowledge DistillationScene Understanding	CodeCode Available	1
CEN-HDR: Computationally Efficient neural Network for real-time High Dynamic Range imaging	Feb 10, 2023	Efficient Neural NetworkKnowledge Distillation	CodeCode Available	1
Tracking-by-Trackers with a Distilled and Reinforced Model	Jul 8, 2020	Knowledge DistillationObject Tracking	CodeCode Available	1
Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model	May 1, 2024	Knowledge DistillationLanguage Modeling	CodeCode Available	1
DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation	Sep 26, 2023	3D Object DetectionAutonomous Driving	CodeCode Available	1
Channel-Aware Distillation Transformer for Depth Estimation on Nano Drones	Mar 18, 2023	Autonomous NavigationDepth Estimation	CodeCode Available	1
Channel Distillation: Channel-Wise Attention for Knowledge Distillation	Jun 2, 2020	Knowledge Distillation	CodeCode Available	1
Distilling a Powerful Student Model via Online Knowledge Distillation	Mar 26, 2021	Knowledge Distillation	CodeCode Available	1
Channel Gating Neural Networks	May 29, 2018	Knowledge DistillationNetwork Pruning	CodeCode Available	1
Distilling Autoregressive Models to Obtain High-Performance Non-Autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed	Dec 19, 2023	Knowledge Distillation	CodeCode Available	1
Data-Free Network Quantization With Adversarial Knowledge Distillation	May 8, 2020	Knowledge DistillationModel Compression	CodeCode Available	1
Channel-wise Knowledge Distillation for Dense Prediction	Nov 26, 2020	Knowledge DistillationPrediction	CodeCode Available	1
Distilling Knowledge from Reader to Retriever for Question Answering	Dec 8, 2020	Information RetrievalKnowledge Distillation	CodeCode Available	1
CheXseg: Combining Expert Annotations with DNN-generated Saliency Maps for X-ray Segmentation	Feb 21, 2021	Image SegmentationKnowledge Distillation	CodeCode Available	1
Chinese grammatical error correction based on knowledge distillation	Jul 31, 2022	Grammatical Error CorrectionKnowledge Distillation	CodeCode Available	1
Distilling Knowledge via Intermediate Classifiers	Feb 28, 2021	Knowledge DistillationTransfer Learning	CodeCode Available	1
Circumventing Outliers of AutoAugment with Knowledge Distillation	Mar 25, 2020	Data AugmentationGeneral Classification	CodeCode Available	1
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability	Jul 6, 2023	Few-Shot Image ClassificationImage Classification	CodeCode Available	1
FocusNet: Classifying Better by Focusing on Confusing Classes	Oct 14, 2021	Classificationimage-classification	CodeCode Available	1
Distilling Object Detectors with Feature Richness	Nov 1, 2021	Knowledge DistillationModel Compression	CodeCode Available	1
Distilling Script Knowledge from Large Language Models for Constrained Language Planning	May 9, 2023	Knowledge Distillation	CodeCode Available	1
Class Attention Transfer Based Knowledge Distillation	Apr 25, 2023	Knowledge DistillationModel Compression	CodeCode Available	1
Advancing Pre-trained Teacher: Towards Robust Feature Discrepancy for Anomaly Detection	May 3, 2024	Anomaly DetectionAttribute	CodeCode Available	1
Class-Balanced Distillation for Long-Tailed Visual Recognition	Apr 12, 2021	Image ClassificationKnowledge Distillation	CodeCode Available	1
Decoupled Kullback-Leibler Divergence Loss	May 23, 2023	Adversarial DefenseAdversarial Robustness	CodeCode Available	1
Distill on the Go: Online knowledge distillation in self-supervised learning	Apr 20, 2021	Knowledge DistillationSelf-Supervised Learning	CodeCode Available	1
Data-Free Knowledge Distillation for Heterogeneous Federated Learning	May 20, 2021	Data-free Knowledge DistillationFederated Learning	CodeCode Available	1
Cloud Object Detector Adaptation by Integrating Different Source Knowledge	Dec 10, 2024	Domain AdaptationKnowledge Distillation	CodeCode Available	1
Advantage-Guided Distillation for Preference Alignment in Small Language Models	Feb 25, 2025	Knowledge Distillation	CodeCode Available	1
DistilVPR: Cross-Modal Knowledge Distillation for Visual Place Recognition	Dec 17, 2023	Knowledge DistillationVisual Place Recognition	CodeCode Available	1
3D Annotation-Free Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving	May 24, 2024	Autonomous DrivingKnowledge Distillation	CodeCode Available	1
Distributed Dynamic Map Fusion via Federated Learning for Intelligent Networked Vehicles	Mar 5, 2021	Federated LearningKnowledge Distillation	CodeCode Available	1
Class-Incremental Learning by Knowledge Distillation with Adaptive Feature Consolidation	Apr 2, 2022	class-incremental learningClass Incremental Learning	CodeCode Available	1
Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification	Apr 21, 2025	Exemplar-FreeKnowledge Distillation	CodeCode Available	1
Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method	Jun 11, 2023	Knowledge DistillationLanguage Modeling	CodeCode Available	1
Data-Free Class-Incremental Hand Gesture Recognition	Jan 1, 2023	class-incremental learningClass Incremental Learning	CodeCode Available	1

Show:10 25 50

← PrevPage 12 of 85Next →

All datasets ImageNet CIFAR-100 COCO (Common Objects in Context)COCO 2017 val PASCAL VOC KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified