SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large "teacher" model to a smaller "student" model. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, much of that capacity may go unused, so a compact student trained to mimic the teacher's outputs can often recover most of the teacher's accuracy at a fraction of the inference cost.
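
In the classic formulation (Hinton et al., 2015), the student is trained to match the teacher's temperature-softened output distribution in addition to the ground-truth labels. Below is a minimal PyTorch sketch of that soft-target loss; the function name distillation_loss and the defaults T=4.0 and alpha=0.9 are illustrative choices, not taken from any paper listed here.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target distillation: KL to the teacher plus hard-label cross-entropy."""
    # Soften both distributions with temperature T; the T**2 factor keeps
    # the soft-target gradients on a comparable scale across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T ** 2)
    # Ordinary supervised cross-entropy on the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Usage sketch: logits have shape [batch, num_classes]; the teacher is frozen.
# loss = distillation_loss(student(x), teacher(x).detach(), y)

Higher temperatures flatten the teacher's distribution and expose the relative probabilities it assigns to incorrect classes, which is exactly the signal the student cannot get from hard labels alone.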

Papers

Showing 501–550 of 4240 papers

Title | Status | Hype
Learning Compatible Embeddings | Code | 1
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models | Code | 1
Advantage-Guided Distillation for Preference Alignment in Small Language Models | Code | 1
Dynamic Knowledge Distillation for Pre-trained Language Models | Code | 1
Dynamic Temperature Knowledge Distillation | Code | 1
Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification | Code | 1
A Deep Knowledge Distillation framework for EEG assisted enhancement of single-lead ECG based sleep staging | Code | 1
EasyST: A Simple Framework for Spatio-Temporal Prediction | Code | 1
EchoDFKD: Data-Free Knowledge Distillation for Cardiac Ultrasound Segmentation using Synthetic Data | Code | 1
APSNet: Attention Based Point Cloud Sampling | Code | 1
Context-Aware Image Inpainting with Learned Semantic Priors | Code | 1
FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction | Code | 1
Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells | Code | 1
Consistent Representation Learning for Continual Relation Extraction | Code | 1
Advancing Pre-trained Teacher: Towards Robust Feature Discrepancy for Anomaly Detection | Code | 1
ConStyle v2: A Strong Prompter for All-in-One Image Restoration | Code | 1
Designing Large Foundation Models for Efficient Training and Inference: A Survey | Code | 1
Consensual Collaborative Training And Knowledge Distillation Based Facial Expression Recognition Under Noisy Annotations | Code | 1
Extract the Knowledge of Graph Neural Networks and Go Beyond it: An Effective Knowledge Distillation Framework | Code | 1
The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image | Code | 1
Conformer and Blind Noisy Students for Improved Image Quality Assessment | Code | 1
CoNMix for Source-free Single and Multi-target Domain Adaptation | Code | 1
Continual Learning for Image Segmentation with Dynamic Query | Code | 1
ConNER: Consistency Training for Cross-lingual Named Entity Recognition | Code | 1
A Contrastive Distillation Approach for Incremental Semantic Segmentation in Aerial Images | Code | 1
Content-Aware GAN Compression | Code | 1
Cross-Layer Distillation with Semantic Calibration | Code | 1
FairDistillation: Mitigating Stereotyping in Language Models | Code | 1
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | Code | 1
FerKD: Surgical Label Adaptation for Efficient Distillation | Code | 1
Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup | Code | 1
Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation | Code | 1
Exploring Extreme Parameter Compression for Pre-trained Language Models | Code | 1
Comprehensive Knowledge Distillation with Causal Intervention | Code | 1
A Dual-Space Framework for General Knowledge Distillation of Large Language Models | Code | 1
ConcealGS: Concealing Invisible Copyright Information in 3D Gaussian Splatting | Code | 1
Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation | Code | 1
Complementary Relation Contrastive Distillation | Code | 1
Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning | Code | 1
Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation | Code | 1
Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection | Code | 1
Anti-Distillation Backdoor Attacks: Backdoors Can Really Survive in Knowledge Distillation | Code | 1
Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation | Code | 1
Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation | Code | 1
COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers | Code | 1
Evolving Search Space for Neural Architecture Search | Code | 1
Collaborative Distillation for Ultra-Resolution Universal Style Transfer | Code | 1
AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression | Code | 1
Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction | Code | 1

Page 11 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 | | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 | | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 | | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 | | Unverified
5 | KD++ (T: RegNetY-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 | | Unverified
6 | VkD (T: RegNetY-160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 | | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 | | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 | | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 | | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 | | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 | | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 | | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 | | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 | | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 | | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 | | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 | | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 | | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 | | Unverified