SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity might not be fully utilized. Distillation therefore trains a smaller "student" model to mimic the outputs of a larger "teacher" model, often retaining most of the teacher's accuracy at a fraction of the inference cost (the T:/S: notation in the benchmark tables below denotes teacher/student pairs).

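The most common formulation (Hinton et al., 2015) optimizes a weighted sum of a hard loss (cross-entropy against the ground-truth labels) and a soft loss (KL divergence between the student's and the teacher's temperature-softened output distributions). The PyTorch sketch below illustrates that loss; the temperature and weighting values are illustrative defaults, not values prescribed by any particular paper listed on this page.

```python
# Minimal sketch of the classic distillation loss (Hinton et al., 2015).
# Temperature and alpha are illustrative choices, not canonical values.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.9):
    """Blend a soft loss (match the teacher's softened distribution)
    with a hard loss (match the ground-truth labels)."""
    # Soften both distributions with the temperature and compare them
    # with KL divergence; the T^2 factor keeps soft-target gradient
    # magnitudes comparable across temperatures.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Standard cross-entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Example: distill 10-class logits for a batch of 8 samples.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```
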
Papers

Showing 251–300 of 4240 papers

Title | Status | Hype
AICSD: Adaptive Inter-Class Similarity Distillation for Semantic Segmentation | Code | 1
Context-Aware Image Inpainting with Learned Semantic Priors | Code | 1
A Deep Knowledge Distillation framework for EEG assisted enhancement of single-lead ECG based sleep staging | Code | 1
Continual All-in-One Adverse Weather Removal with Knowledge Replay on a Unified Network Structure | Code | 1
CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection | Code | 1
Distilling Knowledge from Graph Convolutional Networks | Code | 1
Distillation-Based Training for Multi-Exit Architectures | Code | 1
Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model | Code | 1
Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space | Code | 1
DistilCSE: Effective Knowledge Distillation For Contrastive Sentence Embeddings | Code | 1
Discriminator-Cooperated Feature Map Distillation for GAN Compression | Code | 1
AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning | Code | 1
Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection | Code | 1
DisCo: Distilled Student Models Co-training for Semi-supervised Text Mining | Code | 1
Disentangle and Remerge: Interventional Knowledge Distillation for Few-Shot Object Detection from A Conditional Causal Perspective | Code | 1
Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval | Code | 1
BPKD: Boundary Privileged Knowledge Distillation For Semantic Segmentation | Code | 1
Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation | Code | 1
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings | Code | 1
AdaDistill: Adaptive Knowledge Distillation for Deep Face Recognition | Code | 1
Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection | Code | 1
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models | Code | 1
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Code | 1
Digging into contrastive learning for robust depth estimation with diffusion models | Code | 1
Boosting Light-Weight Depth Estimation Via Knowledge Distillation | Code | 1
Extending global-local view alignment for self-supervised learning with remote sensing imagery | Code | 1
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence | Code | 1
AgeFlow: Conditional Age Progression and Regression with Normalizing Flows | Code | 1
Boosting Multi-Label Image Classification with Complementary Parallel Self-Distillation | Code | 1
DIOD: Self-Distillation Meets Object Discovery | Code | 1
Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning | Code | 1
Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation | Code | 1
Blockwisely Supervised Neural Architecture Search with Knowledge Distillation | Code | 1
A framework for benchmarking class-out-of-distribution detection and its application to ImageNet | Code | 1
Dice Semimetric Losses: Optimizing the Dice Score with Soft Labels | Code | 1
DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation | Code | 1
Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation | Code | 1
Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | Code | 1
BKDSNN: Enhancing the Performance of Learning-based Spiking Neural Networks Training with Blurred Knowledge Distillation | Code | 1
Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval | Code | 1
Prototype-based Incremental Few-Shot Semantic Segmentation | Code | 1
Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction | Code | 1
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents | Code | 1
A Fast Knowledge Distillation Framework for Visual Recognition | Code | 1
BiLD: Bi-directional Logits Difference Loss for Large Language Model Distillation | Code | 1
Adversarially Robust Distillation | Code | 1
Black-box Few-shot Knowledge Distillation | Code | 1
DGEKT: A Dual Graph Ensemble Learning Method for Knowledge Tracing | Code | 1
DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation | Code | 1
Directed Acyclic Transformer for Non-Autoregressive Machine Translation | Code | 1
Page 6 of 85

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy (%) | 86.43 | - | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy (%) | 85.53 | - | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy (%) | 83.93 | - | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy (%) | 83.8 | - | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy (%) | 83.6 | - | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy (%) | 82.9 | - | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy (%) | 82.7 | - | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy (%) | 82.55 | - | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy (%) | 82.5 | - | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy (%) | 82.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 79.86 | - | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 78.76 | - | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 accuracy (%) | 78.6 | - | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 accuracy (%) | 78.28 | - | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 | - | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 accuracy (%) | 77.93 | - | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 accuracy (%) | 77.68 | - | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 accuracy (%) | 77.5 | - | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 accuracy (%) | 76.68 | - | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 accuracy (%) | 76.31 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | - | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | - | Unverified