SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity is often not fully utilized, so a compact "student" model trained to mimic the outputs of the large "teacher" can recover much of its performance at a fraction of the inference cost.
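
As a rough illustration of the idea (not the method of any particular paper listed below), here is a minimal sketch of classic soft-label distillation in the style of Hinton et al.: the loss blends the usual cross-entropy on hard labels with a KL term that pulls the student's softened predictions toward the teacher's. It assumes PyTorch, pre-softmax logits from both models, and placeholder values for the temperature and mixing weight.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Soft-label distillation: alpha-weighted mix of a KL term toward the
    teacher's softened distribution and cross-entropy on the hard labels.
    `temperature` and `alpha` are illustrative defaults, not canonical values."""
    # Soften both distributions; the T^2 factor keeps gradient magnitudes
    # comparable to the cross-entropy term.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd_term = F.kl_div(soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Usage sketch: the teacher is frozen (eval mode, no gradients) and only the
# student's parameters are updated.
# loss = distillation_loss(student(x), teacher(x).detach(), y)
```

Many of the papers below replace or augment this logit-matching term (e.g., with feature-map, relational, or contrastive objectives), but the teacher-supervises-student structure is the same.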

Papers

Showing 501–550 of 4240 papers

Title | Status | Hype
Advancing Pre-trained Teacher: Towards Robust Feature Discrepancy for Anomaly Detection | Code | 1
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models | Code | 1
Discriminator-Cooperated Feature Map Distillation for GAN Compression | Code | 1
Complementary Relation Contrastive Distillation | Code | 1
Directed Acyclic Transformer for Non-Autoregressive Machine Translation | Code | 1
Learning Compatible Embeddings | Code | 1
A Deep Knowledge Distillation framework for EEG assisted enhancement of single-lead ECG based sleep staging | Code | 1
Learning Distilled Collaboration Graph for Multi-Agent Perception | Code | 1
Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification | Code | 1
Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation | Code | 1
Channel-Aware Distillation Transformer for Depth Estimation on Nano Drones | Code | 1
DisCo: Distilled Student Models Co-training for Semi-supervised Text Mining | Code | 1
Disentangle and Remerge: Interventional Knowledge Distillation for Few-Shot Object Detection from A Conditional Causal Perspective | Code | 1
Extending global-local view alignment for self-supervised learning with remote sensing imagery | Code | 1
Digging into contrastive learning for robust depth estimation with diffusion models | Code | 1
DIOD: Self-Distillation Meets Object Discovery | Code | 1
Channel Distillation: Channel-Wise Attention for Knowledge Distillation | Code | 1
A Contrastive Distillation Approach for Incremental Semantic Segmentation in Aerial Images | Code | 1
Channel Gating Neural Networks | Code | 1
Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation | Code | 1
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Code | 1
Distilling a Powerful Student Model via Online Knowledge Distillation | Code | 1
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents | Code | 1
DGEKT: A Dual Graph Ensemble Learning Method for Knowledge Tracing | Code | 1
DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation | Code | 1
DE-RRD: A Knowledge Distillation Framework for Recommender System | Code | 1
A Dual-Space Framework for General Knowledge Distillation of Large Language Models | Code | 1
DFIL: Deepfake Incremental Learning by Exploiting Domain-invariant Forgery Clues | Code | 1
Dice Semimetric Losses: Optimizing the Dice Score with Soft Labels | Code | 1
Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs | Code | 1
Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation | Code | 1
Dense Interspecies Face Embedding | Code | 1
CEN-HDR: Computationally Efficient neural Network for real-time High Dynamic Range imaging | Code | 1
Deformation Flow Based Two-Stream Network for Lip Reading | Code | 1
Densely Guided Knowledge Distillation using Multiple Teacher Assistants | Code | 1
Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning | Code | 1
DeepKD: A Deeply Decoupled and Denoised Knowledge Distillation Trainer | Code | 1
Anti-Distillation Backdoor Attacks: Backdoors Can Really Survive in Knowledge Distillation | Code | 1
Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation | Code | 1
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation | Code | 1
DeepAqua: Self-Supervised Semantic Segmentation of Wetland Surface Water Extent with SAR Images using Knowledge Distillation | Code | 1
Deep Graph-level Anomaly Detection by Glocal Knowledge Distillation | Code | 1
Deep Structured Instance Graph for Distilling Object Detectors | Code | 1
Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation | Code | 1
Data-Free Network Quantization With Adversarial Knowledge Distillation | Code | 1
AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression | Code | 1
DA-Mamba: Domain Adaptive Hybrid Mamba-Transformer Based One-Stage Object Detection | Code | 1
Decoupled Kullback-Leibler Divergence Loss | Code | 1
Data-Free Knowledge Distillation for Heterogeneous Federated Learning | Code | 1
Data-Free Class-Incremental Hand Gesture Recognition | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T:BEiT-L S:ViT-B/14) | Top-1 accuracy % | 86.43 | – | Unverified
2 | ScaleKD (T:Swin-L S:ViT-B/16) | Top-1 accuracy % | 85.53 | – | Unverified
3 | ScaleKD (T:Swin-L S:ViT-S/16) | Top-1 accuracy % | 83.93 | – | Unverified
4 | ScaleKD (T:Swin-L S:Swin-T) | Top-1 accuracy % | 83.8 | – | Unverified
5 | KD++ (T: regnety-16GF S:ViT-B) | Top-1 accuracy % | 83.6 | – | Unverified
6 | VkD (T:RegNety 160 S:DeiT-S) | Top-1 accuracy % | 82.9 | – | Unverified
7 | SpectralKD (T:Swin-S S:Swin-T) | Top-1 accuracy % | 82.7 | – | Unverified
8 | ScaleKD (T:Swin-L S:ResNet-50) | Top-1 accuracy % | 82.55 | – | Unverified
9 | DiffKD (T:Swin-L S: Swin-T) | Top-1 accuracy % | 82.5 | – | Unverified
10 | DIST (T: Swin-L S: Swin-T) | Top-1 accuracy % | 82.3 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 79.86 | – | Unverified
2 | shufflenet-v2 (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 78.76 | – | Unverified
3 | MV-MR (T: CLIP/ViT-B-16 S: resnet50) | Top-1 Accuracy (%) | 78.6 | – | Unverified
4 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 78.28 | – | Unverified
5 | resnet8x4 (T: resnet32x4 S: resnet8x4 [modified]) | Top-1 Accuracy (%) | 78.08 | – | Unverified
6 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v2) | Top-1 Accuracy (%) | 77.93 | – | Unverified
7 | ReviewKD++ (T:resnet-32x4, S:shufflenet-v1) | Top-1 Accuracy (%) | 77.68 | – | Unverified
8 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 77.5 | – | Unverified
9 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.68 | – | Unverified
10 | resnet8x4 (T: resnet32x4 S: resnet8x4) | Top-1 Accuracy (%) | 76.31 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101 S: ResNet50) | mAP | 93.17 | – | Unverified
2 | LSHFM (T: ResNet101 S: MobileNetV2) | mAP | 90.14 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins S: MobileNetV2) | RMSE | 2.43 | – | Unverified