SOTAVerified

Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, this capacity is often not fully utilized, so a compact "student" network can frequently be trained to recover much of a larger "teacher" network's accuracy at a fraction of the inference cost.
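
The standard recipe (Hinton et al., 2015) trains the student on a blend of the teacher's temperature-softened output distribution and the ground-truth labels. Below is a minimal PyTorch sketch of that recipe; the model and loader names, and the temperature/weighting values T=4.0 and alpha=0.9, are illustrative assumptions rather than anything specified on this page.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft-target term: KL divergence between temperature-softened
    # teacher and student distributions, scaled by T^2 so gradient
    # magnitudes stay comparable across temperatures (Hinton et al., 2015).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

def train_step(student, teacher, images, labels, optimizer):
    # The teacher is frozen; only the student receives gradients.
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(images)
    student_logits = student(images)
    loss = distillation_loss(student_logits, teacher_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Many of the papers and leaderboard entries below replace or augment this logit-matching objective with feature-level losses, but the teacher/student setup is the same.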

Papers

Showing 1376–1400 of 4240 papers (page 56 of 170)

Title | Status | Hype
Contextual Affinity Distillation for Image Anomaly Detection |  | 0
Feature Distillation is the Better Choice for Model-Heterogeneous Federated Learning |  | 0
Feature Fusion and Knowledge-Distilled Multi-Modal Multi-Target Detection |  | 0
Constructing Deep Spiking Neural Networks from Artificial Neural Networks with Knowledge Distillation |  | 0
A Gift From Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning |  | 0
Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations |  | 0
Feature Interaction Fusion Self-Distillation Network For CTR Prediction |  | 0
MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models |  | 0
Conformer with dual-mode chunked attention for joint online and offline ASR |  | 0
Agglomerating Large Vision Encoders via Distillation for VFSS Segmentation |  | 0
Configurable Holography: Towards Display and Scene Adaptation |  | 0
Confidence Preservation Property in Knowledge Distillation Abstractions |  | 0
AttentionLite: Towards Efficient Self-Attention Models for Vision |  | 0
Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment |  | 0
ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation |  | 0
Confidence Conditioned Knowledge Distillation |  | 0
Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation |  | 0
Attention is all you need for boosting graph convolutional neural network |  | 0
Confidence Attention and Generalization Enhanced Distillation for Continuous Video Domain Adaptation |  | 0
Attention-guided Feature Distillation for Semantic Segmentation |  | 0
AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes |  | 0
Conditional Generative Data-free Knowledge Distillation |  | 0
Attention-Guided Answer Distillation for Machine Reading Comprehension |  | 0
Conditional Autoregressors are Interpretable Classifiers |  | 0
A Generative Framework for Personalized Learning and Estimation: Theory, Algorithms, and Privacy |  | 0

Benchmark Results

In the tables below, "T:" and "S:" identify the teacher and student models. "Claimed" is the number reported by the paper; the "Verified" column is empty for every entry here, all of which carry the status Unverified.

# | Model | Metric | Claimed | Verified | Status
1 | ScaleKD (T: BEiT-L, S: ViT-B/14) | Top-1 accuracy (%) | 86.43 |  | Unverified
2 | ScaleKD (T: Swin-L, S: ViT-B/16) | Top-1 accuracy (%) | 85.53 |  | Unverified
3 | ScaleKD (T: Swin-L, S: ViT-S/16) | Top-1 accuracy (%) | 83.93 |  | Unverified
4 | ScaleKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 83.8 |  | Unverified
5 | KD++ (T: regnety-16GF, S: ViT-B) | Top-1 accuracy (%) | 83.6 |  | Unverified
6 | VkD (T: RegNety 160, S: DeiT-S) | Top-1 accuracy (%) | 82.9 |  | Unverified
7 | SpectralKD (T: Swin-S, S: Swin-T) | Top-1 accuracy (%) | 82.7 |  | Unverified
8 | ScaleKD (T: Swin-L, S: ResNet-50) | Top-1 accuracy (%) | 82.55 |  | Unverified
9 | DiffKD (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.5 |  | Unverified
10 | DIST (T: Swin-L, S: Swin-T) | Top-1 accuracy (%) | 82.3 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SRD (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 79.86 |  | Unverified
2 | shufflenet-v2 (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 78.76 |  | Unverified
3 | MV-MR (T: CLIP/ViT-B-16, S: resnet50) | Top-1 accuracy (%) | 78.6 |  | Unverified
4 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 78.28 |  | Unverified
5 | resnet8x4 (T: resnet32x4, S: resnet8x4 [modified]) | Top-1 accuracy (%) | 78.08 |  | Unverified
6 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v2) | Top-1 accuracy (%) | 77.93 |  | Unverified
7 | ReviewKD++ (T: resnet-32x4, S: shufflenet-v1) | Top-1 accuracy (%) | 77.68 |  | Unverified
8 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 77.5 |  | Unverified
9 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.68 |  | Unverified
10 | resnet8x4 (T: resnet32x4, S: resnet8x4) | Top-1 accuracy (%) | 76.31 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSHFM (T: ResNet101, S: ResNet50) | mAP | 93.17 |  | Unverified
2 | LSHFM (T: ResNet101, S: MobileNetV2) | mAP | 90.14 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TIE-KD (T: Adabins, S: MobileNetV2) | RMSE | 2.43 |  | Unverified
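
For reference, the Top-1 accuracy (%) metric used in the first two tables is the fraction of validation examples whose highest-scoring predicted class matches the label. A minimal sketch, assuming a PyTorch classifier and a standard (images, labels) validation loader; `model` and `loader` are hypothetical names, not from this page:

```python
import torch

@torch.no_grad()
def top1_accuracy(model, loader, device="cpu"):
    # Fraction of examples whose argmax prediction equals the label,
    # reported as a percentage as in the tables above.
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return 100.0 * correct / total
```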