Knowledge Distillation

Knowledge distillation is the process of transferring knowledge from a large model to a smaller one. Large models (such as very deep neural networks or ensembles of many models) have higher knowledge capacity than small models, but this capacity might not be fully utilized, so a smaller model trained to reproduce the larger model's behavior can often retain much of its accuracy at lower cost.
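One common way to make this transfer concrete is the soft-target loss, in which the student is trained to match the teacher's temperature-softened output distribution in addition to the true label. The sketch below is only illustrative and is not taken from any paper listed on this page; the function names and the `temperature` and `alpha` parameters are assumptions chosen for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and hard-label cross-entropy.

    `temperature` and `alpha` are hypothetical defaults, not values
    prescribed by this page.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy against the true class index
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-term gradients on a comparable scale
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


loss = distillation_loss([2.0, 1.0, 0.1], [3.0, 1.5, 0.2], label=0)
```

With `alpha=1.0` the loss depends only on how closely the student matches the teacher, and it is zero when the two sets of logits are identical; with `alpha=0.0` it reduces to ordinary cross-entropy on the label.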

Papers

Showing 3851–3900 of 4240 papers

Title	Date	Tasks	Status
Knowledge Distillation of Domain-adapted LLMs for Question-Answering in Telecom	Apr 28, 2025	Domain Adaptation, Knowledge Distillation	Unverified
Knowledge Distillation of LLM for Automatic Scoring of Science Education Assessments	Dec 26, 2023	Knowledge Distillation, Mathematical Reasoning	Unverified
Knowledge Distillation of Transformer-based Language Models Revisited	Jun 29, 2022	GPU, Knowledge Distillation	Unverified
Knowledge Distillation on Graphs: A Survey	Feb 1, 2023	Knowledge Distillation, Model Compression	Unverified
Knowledge Distillation on Spatial-Temporal Graph Convolutional Network for Traffic Prediction	Jan 22, 2024	Graph Neural Network, Knowledge Distillation	Unverified
Knowledge Distillation to Ensemble Global and Interpretable Prototype-Based Mammogram Classification Models	Sep 26, 2022	Diversity, Knowledge Distillation	Unverified
Knowledge Distillation Transfer Sets and their Impact on Downstream NLU Tasks	Oct 10, 2022	domain classification, intent-classification	Unverified
Knowledge Distillation Under Ideal Joint Classifier Assumption	Apr 19, 2023	Domain Adaptation, Knowledge Distillation	Unverified
Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data	Oct 24, 2024	Knowledge Distillation, Natural Language Understanding	Unverified
Knowledge distillation using unlabeled mismatched images	Mar 21, 2017	General Classification, image-classification	Unverified
Knowledge distillation via adaptive instance normalization	Mar 9, 2020	Knowledge Distillation, Model Compression	Unverified
Knowledge Distillation via Instance-level Sequence Learning	Jun 21, 2021	General Knowledge, Knowledge Distillation	Unverified
Knowledge Distillation via Query Selection for Detection Transformer	Sep 10, 2024	Knowledge Distillation, object-detection	Unverified
Knowledge distillation via softmax regression representation learning	Jan 1, 2021	Knowledge Distillation, Model Compression	Unverified
Knowledge Distillation via Token-level Relationship Graph	Jun 20, 2023	Knowledge Distillation, Transfer Learning	Unverified
Knowledge Distillation via Weighted Ensemble of Teaching Assistants	Jun 23, 2022	Ensemble Learning, Knowledge Distillation	Unverified
Knowledge Distillation vs. Pretraining from Scratch under a Fixed (Computation) Budget	Apr 30, 2024	Knowledge Distillation, Language Modeling	Unverified
Knowledge distillation with a class-aware loss for endoscopic disease detection	Jul 19, 2022	Diagnostic, Knowledge Distillation	Unverified
Knowledge Distillation with Adapted Weight	Jan 6, 2025	4k, Fairness	Unverified
Knowledge Distillation with Adaptive Asymmetric Label Sharpening for Semi-supervised Fracture Detection in Chest X-rays	Dec 30, 2020	Fracture detection, Knowledge Distillation	Unverified
Knowledge Distillation with BERT for Image Tag-Based Privacy Prediction	Sep 1, 2021	Knowledge Distillation, TAG	Unverified
Knowledge distillation with error-correcting transfer learning for wind power prediction	Apr 1, 2022	Knowledge Distillation, Transfer Learning	Unverified
Knowledge Distillation with Feature Maps for Image Classification	Dec 3, 2018	Classification, General Classification	Unverified
Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution	Apr 3, 2024	Image Super-Resolution, Knowledge Distillation	Unverified
Knowledge Distillation with Noisy Labels for Natural Language Understanding	Sep 21, 2021	Knowledge Distillation, Natural Language Understanding	Unverified
Representative Teacher Keys for Knowledge Distillation Model Compression Based on Attention Mechanism for Image Classification	Jun 26, 2022	GPU, image-classification	Unverified
Knowledge distillation with Segment Anything (SAM) model for Planetary Geological Mapping	May 12, 2023	Decoder, Image Segmentation	Unverified
Knowledge Distillation with Training Wheels	Feb 24, 2025	Knowledge Distillation, Language Modeling	Unverified
Knowledge-Distilled Graph Neural Networks for Personalized Epileptic Seizure Detection	Apr 3, 2023	channel selection, EEG	Unverified
EA-KD: Entropy-based Adaptive Knowledge Distillation	Nov 22, 2023	image-classification, Image Classification	Unverified
Knowledge Consistency between Neural Networks and Beyond	Aug 5, 2019	Knowledge Distillation	Unverified
Knowledge Migration Framework for Smart Contract Vulnerability Detection	Dec 15, 2024	Data-free Knowledge Distillation, Knowledge Distillation	Unverified
Knowledge Representing: Efficient, Sparse Representation of Prior Knowledge for Knowledge Distillation	Nov 13, 2019	Image Classification, Knowledge Distillation	Unverified
Knowledge-Spreader: Learning Semi-Supervised Facial Action Dynamics by Consistifying Knowledge Granularity	Jan 1, 2023	Knowledge Distillation	Unverified
Knowledge Squeezed Adversarial Network Compression	Apr 10, 2019	Knowledge Distillation, Transfer Learning	Unverified
Knowledge Transfer with Visual Prompt in multi-modal Dialogue Understanding and Generation	Oct 1, 2022	Dialogue Understanding, Knowledge Distillation	Unverified
KnowRU: Knowledge Reusing via Knowledge Distillation in Multi-agent Reinforcement Learning	Mar 27, 2021	Deep Reinforcement Learning, Knowledge Distillation	Unverified
KnowSR: Knowledge Sharing among Homogeneous Agents in Multi-agent Reinforcement Learning	May 25, 2021	Deep Reinforcement Learning, Knowledge Distillation	Unverified
Know your tools well: Better and faster QA with synthetic examples	Oct 16, 2021	Diversity, Knowledge Distillation	Unverified
KOALA: Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis	Dec 7, 2023	Denoising, Image Generation	Unverified
KoGNER: A Novel Framework for Knowledge Graph Distillation on Biomedical Named Entity Recognition	Mar 19, 2025	Knowledge Distillation, Knowledge Graphs	Unverified
KroneckerBERT: Learning Kronecker Decomposition for Pre-trained Language Models via Knowledge Distillation	Sep 13, 2021	Knowledge Distillation, Language Modeling	Unverified
KroneckerBERT: Significant Compression of Pre-trained Language Models Through Kronecker Decomposition and Knowledge Distillation	Jul 1, 2022	Knowledge Distillation, Language Modeling	Unverified
Kronecker Decomposition for GPT Compression	Oct 15, 2021	Knowledge Distillation, Language Modeling	Unverified
KTAN: Knowledge Transfer Adversarial Network	Oct 18, 2018	image-classification, Image Classification	Unverified
Label Assignment Distillation for Object Detection	Sep 16, 2021	Knowledge Distillation, Object	Unverified
Label Augmentation via Time-based Knowledge Distillation for Financial Anomaly Detection	Jan 5, 2021	Anomaly Detection, Knowledge Distillation	Unverified
Label-Context-Dependent Internal Language Model Estimation for CTC	Jun 6, 2025	Knowledge Distillation, Language Modeling	Unverified
Label Denoising with Large Ensembles of Heterogeneous Neural Networks	Sep 12, 2018	Data Augmentation, Denoising	Unverified
Label driven Knowledge Distillation for Federated Learning with non-IID Data	Sep 29, 2022	Federated Learning, Knowledge Distillation	Unverified

Datasets: ImageNet, CIFAR-100, COCO (Common Objects in Context), COCO 2017 val, PASCAL VOC, KITTI

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ScaleKD (T:BEiT-L S:ViT-B/14)	Top-1 accuracy %	86.43	—	Unverified
2	ScaleKD (T:Swin-L S:ViT-B/16)	Top-1 accuracy %	85.53	—	Unverified
3	ScaleKD (T:Swin-L S:ViT-S/16)	Top-1 accuracy %	83.93	—	Unverified
4	ScaleKD (T:Swin-L S:Swin-T)	Top-1 accuracy %	83.8	—	Unverified
5	KD++(T: regnety-16GF S:ViT-B)	Top-1 accuracy %	83.6	—	Unverified
6	VkD (T:RegNety 160 S:DeiT-S)	Top-1 accuracy %	82.9	—	Unverified
7	SpectralKD (T:Swin-S S:Swin-T)	Top-1 accuracy %	82.7	—	Unverified
8	ScaleKD (T:Swin-L S:ResNet-50)	Top-1 accuracy %	82.55	—	Unverified
9	DiffKD (T:Swin-L S: Swin-T)	Top-1 accuracy %	82.5	—	Unverified
10	DIST (T: Swin-L S: Swin-T)	Top-1 accuracy %	82.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SRD (T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	79.86	—	Unverified
2	shufflenet-v2(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	78.76	—	Unverified
3	MV-MR (T: CLIP/ViT-B-16 S: resnet50)	Top-1 Accuracy (%)	78.6	—	Unverified
4	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	78.28	—	Unverified
5	resnet8x4 (T: resnet32x4 S: resnet8x4 [modified])	Top-1 Accuracy (%)	78.08	—	Unverified
6	ReviewKD++(T:resnet-32x4, S:shufflenet-v2)	Top-1 Accuracy (%)	77.93	—	Unverified
7	ReviewKD++(T:resnet-32x4, S:shufflenet-v1)	Top-1 Accuracy (%)	77.68	—	Unverified
8	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	77.5	—	Unverified
9	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.68	—	Unverified
10	resnet8x4 (T: resnet32x4 S: resnet8x4)	Top-1 Accuracy (%)	76.31	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	77.16	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	73.73	—	Unverified
3	ADLIK-Faster (T: Faster R-CNN vit-base S: Faster R-CNN deit-small)	box AP	47.6	—	Unverified
4	ADLIK-Mask (T: Mask R-CNN vit-base S: Mask R-CNN deit-small)	mask AP	42.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet50))	AP@0.5	61.8	—	Unverified
2	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(resnet18))	AP@0.5	57.96	—	Unverified
3	ReviewKD++(T: faster rcnn(resnet101), S:faster rcnn(mobilenet-v2))	AP@0.5	55.18	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSHFM (T: ResNet101 S: ResNet50)	mAP	93.17	—	Unverified
2	LSHFM (T: ResNet101 S: MobileNetV2)	mAP	90.14	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TIE-KD (T: Adabins S: MobileNetV2)	RMSE	2.43	—	Unverified