
Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. A smaller model can therefore often be trained to mimic the large model's behavior, retaining most of its accuracy at a fraction of the computational cost.
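Most entries below build on the soft-target objective introduced in "Distilling the Knowledge in a Neural Network" (Hinton et al., listed in the papers table): the student is trained to match the teacher's temperature-softened output distribution alongside the ground-truth labels. Below is a minimal PyTorch sketch of that loss; the function name and the default T and alpha values are illustrative choices, not taken from any paper on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Soft-target distillation loss in the style of Hinton et al. (2015).

    Blends the KL divergence between temperature-softened teacher and
    student distributions with ordinary cross-entropy on hard labels.
    T and alpha are hyperparameters; the defaults here are illustrative.
    """
    # Soften both output distributions with temperature T.
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)

    # kl_div expects log-probabilities for the input and probabilities
    # for the target; "batchmean" averages over the batch.
    kd_term = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

    # Standard supervised loss on the ground-truth class indices.
    ce_term = F.cross_entropy(student_logits, labels)

    return alpha * kd_term + (1.0 - alpha) * ce_term
```

The T² factor compensates for the 1/T² scaling that the temperature introduces into the soft-target gradients, keeping the two loss terms on a comparable scale as T varies, as noted in the original paper.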

Papers

Showing papers 651–700 of 4,240 (page 14 of 85)

Title | Status | Hype
A Fast Knowledge Distillation Framework for Visual Recognition | Code | 1
Distilling Holistic Knowledge with Graph Neural Networks | Code | 1
Contrastive Distillation on Intermediate Representations for Language Model Compression | Code | 1
Distilling Image Classifiers in Object Detectors | Code | 1
Geometer: Graph Few-Shot Class-Incremental Learning via Prototype Representation | Code | 1
Contrastive Representation Distillation | Code | 1
Prototype-based Incremental Few-Shot Semantic Segmentation | Code | 1
Distilling Knowledge from Ensembles of Acoustic Models for Joint CTC-Attention End-to-End Speech Recognition | Code | 1
Contrastive Deep Supervision | Code | 1
COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers | Code | 1
Distilling Knowledge from Self-Supervised Teacher by Embedding Graph Alignment | Code | 1
Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation | Code | 1
Distilling Linguistic Context for Language Model Compression | Code | 1
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability | Code | 1
GenFormer -- Generated Images are All You Need to Improve Robustness of Transformers on Small Datasets | Code | 1
Distilling Meta Knowledge on Heterogeneous Graph for Illicit Drug Trafficker Detection on Social Media | Code | 1
Geometric Knowledge Distillation: Topology Compression for Graph Neural Networks | Code | 1
Distilling Object Detectors via Decoupled Features | Code | 1
Distilling Object Detectors with Feature Richness | Code | 1
Knowledge Distillation via Route Constrained Optimization | Code | 1
Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks | Code | 1
Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation | Code | 1
Continual Learning for LiDAR Semantic Segmentation: Class-Incremental and Coarse-to-Fine strategies on Sparse Data | Code | 1
Distilling the Knowledge in a Neural Network | Code | 1
Complementary Relation Contrastive Distillation | Code | 1
LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection | Code | 1
Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation? | Code | 1
Label Poisoning is All You Need | Code | 1
Generative Bias for Robust Visual Question Answering | Code | 1
A Discrepancy Aware Framework for Robust Anomaly Detection | Code | 1
Continual Learning for Image Segmentation with Dynamic Query | Code | 1
Comprehensive Knowledge Distillation with Causal Intervention | Code | 1
Distill on the Go: Online knowledge distillation in self-supervised learning | Code | 1
Distill the Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation | Code | 1
DistilProtBert: A distilled protein language model used to distinguish between real proteins and their randomly shuffled counterparts | Code | 1
DistilPose: Tokenized Pose Regression with Heatmap Distillation | Code | 1
Generative Model-based Feature Knowledge Distillation for Action Recognition | Code | 1
AgeFlow: Conditional Age Progression and Regression with Normalizing Flows | Code | 1
Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors | Code | 1
Distributed Dynamic Map Fusion via Federated Learning for Intelligent Networked Vehicles | Code | 1
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence | Code | 1
DisWOT: Student Architecture Search for Distillation WithOut Training | Code | 1
Distribution-aware Knowledge Prototyping for Non-exemplar Lifelong Person Re-identification | Code | 1
Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification | Code | 1
Continual evaluation for lifelong learning: Identifying the stability gap | Code | 1
DKDL-Net: A Lightweight Bearing Fault Detection Model via Decoupled Knowledge Distillation and Low-Rank Adaptation Fine-tuning | Code | 1
DM-VTON: Distilled Mobile Real-time Virtual Try-On | Code | 1
Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup | Code | 1
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval | Code | 1
Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing | Code | 1

Benchmark Results

In each entry, T: denotes the teacher model and S: the student distilled from it. The Verified column is empty for results whose status is Unverified.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | — | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | — | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | — | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | — | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | — | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | — | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | — | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | — | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | — | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | — | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | — | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | — | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | — | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | — | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | — | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | — | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | — | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | — | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | — | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | — | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | — | Unverified